Data Engineer
2021 - PRESENT
Chegg - Toptal
- Profiled and modeled data in a data warehouse for the client by analyzing 30 existing reports and 120 tables.
- Created Python jobs on Databricks to pull data from the vendor's API and load it into Redshift tables, with data quality checks and process control.
- Built, modeled, and maintained a data pipeline on Databricks to load data into Redshift and Tableau Server. Created a data mart structure with 20 tables that support more than 50 Tableau reports.
- Developed a data quality framework that runs after the client's data pipeline, plus a Tableau report that surfaces and alerts on any data issues.
- Migrated the data structure of ten tables from Looker to Tableau Server by building the data pipeline for the marketing tables.
Technologies: Redshift, Databricks, SQL, Python, APIs, Pandas, Data Pipelines, Data Modeling, Data Warehousing, Data Marts, Tableau, Dashboards, Amazon Web Services (AWS)
Data Engineer
2021 - 2021
StoneCo
- Created batch and streaming data pipelines for business teams.
- Loaded data from external data providers into the data lake.
- Created a data quality system for data processing jobs.
Technologies: Apache Airflow, Apache Kafka, Hadoop, Apache Hive, ETL, Python, Amazon S3 (AWS S3), Redshift, SQL, ELT, Data Quality, Data Processing, Jupyter Notebook, Pandas, NumPy, Data Engineering, Data Modeling, Data Architecture, Data Warehouse Design, Data Warehousing, Financial Data, Data Pipelines, Data Marts, Banking & Finance, Amazon Web Services (AWS), Amazon Athena, Data, ETL Development, Data Science, Docker Compose, Docker
Data Engineer
2019 - 2021
Banco Original
- Migrated ten on-premises ETL processes to Google Cloud Platform using Google Cloud Storage, Cloud Functions, Google Pub/Sub, and Google BigQuery.
- Developed a system with Hive and Power BI covering more than 20 financial products and around 100 campaigns monthly. The system automatically computes the conversion rate of marketing campaigns, specifically email, push, and ads.
- Created an API integration that posts BigQuery data to the Facebook Marketing API and Google Ads using Google Cloud Platform tools: BigQuery, Cloud Functions, and Pub/Sub.
- Worked as a product owner and developer to create the data pipelines and data models that support a newly acquired marketing platform, Oracle Responsys, and adapt it to the company.
- Worked with product managers to identify the correct information in the transactional database and develop the ETL that loads it into a data warehouse.
Technologies: Apache Airflow, Agile, Google Cloud Platform (GCP), ETL, Google Pub/Sub, Google Cloud Functions, Google BigQuery, Facebook Marketing API, Google Ads API, ELT, Data Warehouse Design, Data Warehousing, Data Lakes, Financial Products, Financial Data, Product Owner, Oracle Responsys, Jupyter Notebook, Cloud Computing, Google Cloud Composer, Data Pipelines, Apache Hive, HDFS, Hadoop, Zeppelin, Python, SQL, Pandas, NumPy, Data Quality, Data Processing, PyCharm, Functional Programming, Data Engineering, Data Modeling, Data Architecture, Apache Spark, Google Cloud Storage, Data Marts, Banking & Finance, Looker, Data, ETL Development, Data Science
Business Intelligence Analyst Jr
2017 - 2019
Banco Original
- Supported the development and structure of the data lake environment on HDFS.
- Developed fifteen ETL processes over data sources located on the data lake (HDFS) and delivered the results in Hive for business users' analytics.
- Worked with marketing managers to develop data-driven sales strategies through in-depth analysis of customers' behavior.
- Developed product performance dashboards to be accessed by the sales and product teams.
- Created more than 20 customer audiences for marketing campaign journeys according to product analytics rules.
- Improved the performance of a customer service chatbot through analytics on JSON files.
- Worked with business and product teams to disseminate analytics best practices, improve their query performance, and define the correct rules to achieve their analysis goals.
Technologies: HDFS, Hadoop, Zeppelin, Apache Hive, ETL, Chatbots, Data Analytics, Business Intelligence (BI), ELT, Financial Data Analytics, Financial Data, Financial Products, Marketing Automation, Jupyter Notebook, Microsoft Power BI, Marketing Campaign Design, Data Auditing, Sales Strategy, Data Lakes, Data Warehouse Design, SQL, Python, Data Quality, Data Processing, Data Engineering, Data Modeling, Data Warehousing, Data Marts, Banking & Finance, Data, ETL Development, Data Science
Business Intelligence Intern
2016 - 2017
Banco Original
- Created reports to analyze customer profitability and acquisition.
- Worked with product managers to audit ten financial products on the analytics databases and compare them with the transactional system.
- Developed around twenty ad-hoc queries with SQL and SAS using SAS Enterprise Guide.
- Created around 25 top campaigns using several data sources (e.g., customers who delay their credit card bill payments).
- Developed the data dictionary for the databases of ten financial products.
- Supported the survey and control of informational data gaps.
- Built strong knowledge of the flows, rules, and specifications of bank products, such as credit cards, loans, and overdrafts, to audit the databases and create campaigns and reports.
Technologies: SQL, SAS Enterprise Guide, Dashboards, Excel 365, Marketing Automation, Financial Products, Data Analytics, Business Intelligence (BI), Financial Data, Financial Data Analytics, Banking & Finance, Data