
Victor Eduardo Pato Paulillo
Data Engineering Developer
Victor is a data engineer with six years of experience in the fintech and edtech markets, building data warehouses and data pipelines and working on data modeling, analysis, dashboards, and marketing campaigns. He is an expert in Python and big data analytics and has experience as a product owner implementing and sustaining a CRM platform.
Preferred Environment
Apache Airflow, Google Cloud Platform (GCP), SQL, Python, Business Intelligence (BI), ETL, Databricks, Amazon Web Services (AWS), Tableau, Redshift
The most amazing...
...thing I've created was a system and dashboard that automatically calculate the conversion rate of marketing campaigns for more than 20 financial products.
Work Experience
Data Engineer
Chegg - Toptal
- Profiled and modeled data in a data warehouse for the client by analyzing 30 existing reports and 120 tables.
- Created Python jobs on Databricks to pull data from the vendor's API and load it into Redshift tables, with data quality checks and process control.
- Built, modeled, and maintained a data pipeline on Databricks to load data into Redshift and Tableau Server. Created a data mart structure with 20 tables that support more than 50 Tableau reports.
- Developed a data quality framework that runs after the client's data pipeline, along with a Tableau report that displays and alerts on any data issues.
- Migrated the data structure of ten tables on Looker to Tableau Server by building the data pipeline for marketing tables.
Data Engineer
StoneCo
- Created batch and streaming data pipelines for business teams.
- Loaded data from external data providers into the data lake.
- Created a data quality system for data processing jobs.
Data Engineer
Banco Original
- Migrated ten on-premises ETL processes to Google Cloud Platform using Google Cloud Storage, Cloud Functions, Google Pub/Sub, and Google BigQuery.
- Developed a system with Hive and Power BI covering more than 20 financial products and around 100 campaigns per month. It automatically calculates the conversion rate of marketing campaigns, specifically email, push, and ads.
- Created an API integration that posts BigQuery data to the Facebook Marketing API and Google Ads using Google Cloud Platform tools: BigQuery, Cloud Functions, and Pub/Sub.
- Worked as a product owner and developer to create data pipelines and data modeling to support a newly acquired marketing platform, Oracle Responsys, and adapt it to the company.
- Worked with product managers to identify the correct information in the transactional database and develop the ETL that loads it into a data warehouse.
Business Intelligence Analyst Jr
Banco Original
- Supported the development and structure of the data lake environment on HDFS.
- Developed 15 ETL processes on data sources in the data lake (HDFS) and delivered the results on Hive for business users' analytics.
- Worked with marketing managers to develop data-driven sales strategies through in-depth analysis of customers' behavior.
- Developed product performance dashboards to be accessed by the sales and product teams.
- Created more than 20 customer audiences for marketing campaign journeys according to product analytics rules.
- Improved the performance of a customer service chatbot through analytics on JSON files.
- Worked with business and product teams to disseminate analytics best practices, improve their query performance, and define the correct rules to achieve their analysis goals.
Business Intelligence Intern
Banco Original
- Created reports to analyze customer profitability and acquisition.
- Worked with product managers to audit ten financial products on the analytics databases and compare them with the transactional system.
- Developed around 20 ad-hoc queries with SQL and SAS using the SAS Enterprise Guide software.
- Created around 25 top campaigns from several data sources (e.g., customers who are late paying their credit card bill).
- Developed the data dictionary for the databases of ten financial products.
- Supported the survey and control of informational data gaps.
- Applied strong knowledge of the flows, rules, and specifications of bank products, such as credit cards, loans, and overdrafts, to audit the databases and create campaigns and reports.
Experience
A Post on BigQuery Data on Facebook Marketing API
https://victor-paulillo.medium.com/post-bigquery-data-on-facebook-marketing-api-276516566bbe
The article shows a simplified version of the project and explains how to post data to Facebook Ads, where it can be used to build audiences for ad targeting or be registered as offline conversions.
After completing the project, it became possible to dive deep into customer analytics and machine learning models to improve ad performance and develop new marketing strategies involving Facebook Ads.
Offline conversions help Facebook's machine learning better identify your company's best customers, especially when you have products and services that aren’t distributed online.
Data Pipeline of Open Dataset of Brazilian Government Companies Registration
I was the sole developer of this process, which covered assembling the Airflow environment on a VM with Docker, downloading and analyzing the open dataset of the Brazilian government's company registry on Google Cloud Storage, transforming the dataset into a table with BigQuery, running data quality validations, and loading the final table into a Postgres table on Cloud SQL.
All of these steps were implemented in an Airflow DAG scheduled to run weekly.
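The validate-before-load part of a pipeline like this can be sketched in a few lines. This is a minimal illustration in plain Python, not the Airflow DAG from the project: the column names (`company_id`, `company_name`) and the functions `validate_rows` and `run_pipeline` are hypothetical, since the actual registry schema and task code are not described here.

```python
from typing import Callable, Dict, List

# Hypothetical column names; the real registry schema is not given in the text.
REQUIRED_COLUMNS = ("company_id", "company_name")


def validate_rows(rows: List[Dict[str, str]]) -> List[str]:
    """Return a list of human-readable data quality issues (empty if clean)."""
    issues = []
    if not rows:
        issues.append("dataset is empty")
    seen_ids = set()
    for index, row in enumerate(rows):
        # Every required column must be present and non-empty.
        for column in REQUIRED_COLUMNS:
            if not row.get(column):
                issues.append(f"row {index}: missing {column}")
        # The identifier must be unique across the dataset.
        company_id = row.get("company_id")
        if company_id:
            if company_id in seen_ids:
                issues.append(f"row {index}: duplicate company_id {company_id}")
            seen_ids.add(company_id)
    return issues


def run_pipeline(
    extract: Callable[[], List[Dict[str, str]]],
    load: Callable[[List[Dict[str, str]]], None],
) -> int:
    """Run extract -> validate -> load, refusing to load if validation fails."""
    rows = extract()
    issues = validate_rows(rows)
    if issues:
        raise ValueError("data quality check failed: " + "; ".join(issues))
    load(rows)
    return len(rows)
```

In an orchestrated setup, each of these functions would typically become its own task, so that a failed validation stops the run before anything reaches the destination table.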
Skills
Languages
SQL, Python
Other
Data Engineering, Google BigQuery, Google Data Studio, Data Analysis, Google Pub/Sub, Google Cloud Functions, Dashboards, Marketing Automation, Financial Products, Data Analytics, ELT, Data Warehousing, Agile Sprints, Financial Data, Financial Data Analytics, Marketing Campaign Design, Data Auditing, Data Quality, Data Warehouse Design, Data, ETL Development, Statistics, Economics, Engineering, APIs, Excel 365, Chatbots, Cloud Computing, Sales Strategy, Product Owner, Data Marts, Data Processing, Data Modeling, Data Architecture
Libraries/APIs
Pandas, Google Ads API, Facebook Marketing API, TensorFlow, NumPy
Tools
Apache Airflow, Microsoft Power BI, BigQuery, PyCharm, SAS Enterprise Guide, Google Cloud Composer, Amazon Athena, Tableau, Docker Compose
Paradigms
Agile, Business Intelligence (BI), ETL, Management, Functional Programming, Data Science
Platforms
Jupyter Notebook, Google Cloud Platform (GCP), Oracle Responsys, Amazon Web Services (AWS), Databricks, Zeppelin, Apache Kafka, Docker
Storage
Apache Hive, HDFS, Data Lakes, Redshift, Google Cloud Storage, Data Pipelines, Amazon S3 (AWS S3), PostgreSQL, Google Cloud SQL
Industry Expertise
Banking & Finance
Frameworks
Hadoop, Apache Spark
Education
Bachelor’s Degree in Production Engineering
Federal Institute of São Paulo (IFSP-SPO) - São Paulo, Brazil
Certifications
Modernizing Data Lakes and Data Warehouses with GCP
Coursera
Google Cloud Platform Big Data and Machine Learning Fundamentals
Coursera