
Michał Porębski

Verified Expert in Engineering

Data Engineer and Software Developer

Location
Choszczno, Poland
Toptal Member Since
January 3, 2022

Michał is an experienced ETL, data warehousing, reporting, and data visualization specialist with an academic background in statistics. His past projects include designing and implementing entire reporting and data warehousing setups, building reports, dashboards, and data pipelines, and optimizing queries. He has experience in all stages of data management: extraction, transformation, and analysis.

Portfolio

Darwill, Inc.
SQL, Data Engineering, Tableau, ETL, Python, Data Analysis, Redshift...
Freelance
Python 3, SQL, Redshift, Google BigQuery, BigQuery, Data Warehousing...
Freelance
Python 3, Snowflake, Amazon Web Services (AWS), PostgreSQL, Redshift...

Experience

Availability

Part-time

Preferred Environment

Python, SQL, PostgreSQL, Snowflake, Redshift, Tableau, DataGrip, PyCharm, Business Intelligence (BI), Data Engineering

The most amazing...

...thing I've accomplished is developing a Python ELT framework handling thousands of files from multiple sources (250+ billion rows, 2.5+ TB of data processed).

Work Experience

Senior Data Engineer

2022 - 2023
Darwill, Inc.
  • Designed and developed Python packages serving as API wrappers for address parsing, standardization, and geocoding.
  • Created and developed internal Python packages to streamline common data engineering and data science tasks (Pandas DataFrame manipulation and transformations, loading data into Redshift, Lambda deployment, and parallel data processing).
  • Built Lambda-based Redshift external functions, simplifying data pipelines and reducing what had been a day of a data engineer's work to seconds of on-the-fly processing (a minimal sketch follows this entry).
  • Introduced software engineering best practices (source control, usage of virtual environments, packaging code, credentials management, logging, and documentation) and mentored junior team members.
  • Audited existing SQL data pipelines and Python analytical scripts.
Technologies: SQL, Data Engineering, Tableau, ETL, Python, Data Analysis, Redshift, Amazon Aurora, MySQL, APIs, AWS Lambda, Amazon S3 (AWS S3), Amazon Web Services (AWS), Business Intelligence (BI), PostgreSQL, Data Architecture, DataGrip, PyCharm, Data Warehousing, Data, ETL Tools, Technical Architecture, Monitoring, Data Auditing, Agile
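
As context for the external functions above: Redshift scalar Lambda UDFs exchange JSON with the Lambda, which receives batched argument rows and must return one result per row. Below is a minimal sketch of such a handler under that documented protocol; the parse_address call is a hypothetical stand-in for the actual API wrapper.

    import json

    def parse_address(raw):
        # Hypothetical stand-in for the real address-standardization
        # API wrapper described above.
        return raw.strip().upper()

    def handler(event, context):
        # Redshift batches UDF calls: event["arguments"] is a list of
        # rows, each row a list of the function's argument values.
        try:
            results = [
                parse_address(row[0]) if row[0] is not None else None
                for row in event["arguments"]
            ]
            return json.dumps({"success": True, "results": results})
        except Exception as exc:
            # Redshift surfaces error_msg to the SQL caller.
            return json.dumps({"success": False, "error_msg": str(exc)})

On the Redshift side, such a handler is registered with CREATE EXTERNAL FUNCTION ... LAMBDA ... IAM_ROLE ..., after which analysts can call it like any scalar SQL function.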

Senior Analytics Engineer | Data Architect

2021 - 2022
Freelance
  • Architected and developed improved extrapolation and calibration SQL pipelines dealing with tracking data affected by the GDPR and other privacy-related changes.
  • Architected and developed a Python pipeline for exporting cohort data from the Adjust API and loading it into Redshift (a sketch of the pattern follows this entry).
  • Audited and optimized a marketing attribution SQL pipeline by combining transactional data with tracking data from Google Analytics and Adjust.
  • Supported the migration of SQL pipelines from Redshift to BigQuery.
Technologies: Python 3, SQL, Redshift, Google BigQuery, BigQuery, Data Warehousing, Amazon S3 (AWS S3), Tableau, Adjust, Business Intelligence (BI), Python, PostgreSQL, Data Engineering, Data Architecture, Amazon Web Services (AWS), DataGrip, PyCharm, AWS Lambda, APIs, ETL, Data, ETL Tools, Technical Architecture, Monitoring, Data Auditing, Agile
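
A minimal sketch of the export-and-load pattern behind the Adjust pipeline, assuming a CSV-returning endpoint and the common S3-staging-plus-COPY load path; the URL, bucket, table, and credentials are illustrative placeholders, not Adjust's actual API surface.

    import boto3
    import psycopg2
    import requests

    API_URL = "https://api.example.com/cohorts"  # hypothetical endpoint
    BUCKET, KEY = "my-data-lake", "adjust/cohorts.csv"

    def export_cohorts(api_token):
        resp = requests.get(
            API_URL,
            headers={"Authorization": f"Bearer {api_token}"},
            params={"format": "csv"},
            timeout=60,
        )
        resp.raise_for_status()
        return resp.content

    def load_to_redshift(dsn, iam_role):
        # Stage on S3, then bulk-load with COPY instead of row inserts.
        copy_sql = f"""
            COPY analytics.adjust_cohorts
            FROM 's3://{BUCKET}/{KEY}'
            IAM_ROLE '{iam_role}'
            CSV IGNOREHEADER 1;
        """
        with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
            cur.execute(copy_sql)

    if __name__ == "__main__":
        boto3.client("s3").put_object(
            Bucket=BUCKET, Key=KEY, Body=export_cohorts(api_token="...")
        )
        load_to_redshift(dsn="...", iam_role="arn:aws:iam::...:role/copy")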

Data Engineer | Data Architect

2019 - 2020
Freelance
  • Architected and implemented a Snowflake-based data warehouse that stored calculation results and was optimized for various downstream consumers, including analysts, mechanical engineers, and data visualization applications.
  • Built and architected a Python-based ELT framework to ingest thousands of files from multiple data sources in parallel, with a focus on continuous updates and recalculations while maintaining data traceability (a sketch of the parallel ingestion follows this entry).
  • Audited and migrated a PostgreSQL-based Data Vault 2.0 data warehouse (250 billion rows and over 2.5 terabytes of data) to Snowflake.
  • Architected and developed various ETL pipelines (time series, calculation engine results, and weather data) and Tableau dashboards (data sets and data ingestion statistics).
  • Completed four weeks of Snowflake architecture and cost optimization workshops with Snowflake consultants.
  • Mentored junior members of the team and trained Tableau users.
Technologies: Python 3, Snowflake, Amazon Web Services (AWS), PostgreSQL, Redshift, Amazon Redshift Spectrum, Talend, Tableau, Data Vaults, Business Intelligence (BI), Python, SQL, Data Engineering, Data Architecture, DataGrip, PyCharm, Amazon Aurora, Amazon S3 (AWS S3), AWS Lambda, APIs, ETL, Data Warehousing, Google BigQuery, Data, ETL Tools, Technical Architecture, Monitoring, Data Auditing, Agile, Industrial Internet of Things (IIoT)
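
A minimal sketch of the parallel ingestion idea from the ELT framework above, using the Snowflake Python connector's PUT to stage files concurrently and a single COPY INTO to bulk-load them; the stage, table, and connection details are illustrative, and one connection per worker sidesteps thread-safety questions.

    from concurrent.futures import ThreadPoolExecutor, as_completed
    from pathlib import Path
    import snowflake.connector

    STAGE, TABLE = "@etl_stage", "raw.landing"  # illustrative names

    def stage_file(params, path):
        # One connection per worker keeps the sketch unambiguous
        # about connector thread safety.
        conn = snowflake.connector.connect(**params)
        try:
            conn.cursor().execute(
                f"PUT file://{path} {STAGE} AUTO_COMPRESS=TRUE"
            )
        finally:
            conn.close()
        return path.name

    def ingest(params, files):
        with ThreadPoolExecutor(max_workers=8) as pool:
            futures = [pool.submit(stage_file, params, f) for f in files]
            for fut in as_completed(futures):
                print("staged:", fut.result())
        # One bulk COPY picks up every staged file at once.
        conn = snowflake.connector.connect(**params)
        try:
            conn.cursor().execute(
                f"COPY INTO {TABLE} FROM {STAGE} FILE_FORMAT=(TYPE=CSV)"
            )
        finally:
            conn.close()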

Product Manager, Business Intelligence

2018 - 2018
Freelance
  • Audited, optimized, and managed data warehouse combining operational, tracking, and marketing data.
  • Audited, designed, and developed ETL pipelines for displaying marketing data.
  • Managed marketing attribution, cohort and profitability calculations, and reporting for C-level management.
  • Trained employees in self-service reporting, built a company-wide KPI tree, and increased data awareness within the company.
Technologies: Python 3, PostgreSQL, MySQL, Segment, Saiku, Business Intelligence (BI), Python, SQL, Snowflake, Redshift, Tableau, Data Engineering, Data Architecture, Amazon Web Services (AWS), DataGrip, PyCharm, Amazon S3 (AWS S3), APIs, ETL, Data Analysis, Data Warehousing, Data, ETL Tools, Monitoring, Data Auditing, Agile

Business Intelligence Manager

2016 - 2017
Pandata
  • Designed and implemented entire business intelligence and reporting infrastructures and solutions for all stages of data gathering, processing, and analysis.
  • Designed and developed ETL pipelines for various APIs and data sources, such as AdWords, Google Analytics, BigQuery, Google Sheets, Facebook, Smartly, YouTube Analytics, Amazon S3, data crawlers, web scrapers, and data processing web apps.
  • Designed and developed data warehouses and reporting-oriented data pipelines. Reviewed and optimized SQL scripts.
  • Gathered business requirements, designed and developed multiple reports and dashboards.
  • Identified and defined actionable KPIs, analyzed performance against benchmarks, analyzed retention, customer cohorts, and customer lifetime value, and analyzed CAC and ROI of marketing campaigns.
  • Recruited, supervised, and trained members of the Business Intelligence team.
Technologies: Python 3, PostgreSQL, Redshift, BigQuery, Microsoft SQL Server, MySQL, Tableau, Business Intelligence (BI), Python, SQL, Snowflake, Data Engineering, Data Architecture, Amazon Web Services (AWS), Amazon Redshift Spectrum, DataGrip, PyCharm, Amazon Aurora, Amazon S3 (AWS S3), AWS Lambda, APIs, Data Analysis, ETL, Data Warehousing, Google BigQuery, Data, ETL Tools, Monitoring, Data Auditing, Agile, Industrial Internet of Things (IIoT)

Business Intelligence Analyst

2015 - 2016
Zalando
  • Gathered business requirements from stakeholders, managed communication with developers, and prepared implementation plans.
  • Integrated data from multiple distributed data sources such as warehouse events, delivery, and return events from logistics partners.
  • Analyzed operational data such as warehouse stock movement, productivity, delivery, and returns, and created reports for the controlling department.
  • Managed ad hoc SQL requests, reports, and analyses.
Technologies: PostgreSQL, Oracle, Exasol, MicroStrategy, Business Intelligence (BI), SQL, Redshift, Tableau, APIs, Data Analysis, ETL, Data Warehousing, Data, Data Auditing, Agile

Business Intelligence Analyst

2015 - 2015
Helpling
  • Gathered business requirements and designed and developed all internal and external reporting for 50+ stakeholders, including management, marketing, operations, and investors.
  • Integrated data from multiple distributed data sources necessary for business reporting, such as Salesforce, Google Analytics, CRM tools, and the production database. Created and maintained the reporting-oriented data pipelines.
  • Migrated all reporting from Excel to Tableau, designed and developed multiple reports and dashboards, and administered Tableau Server.
  • Gathered business requirements, defined KPIs, analyzed retention, customer cohorts, and customer lifetime value. Analyzed CAC and ROI of marketing campaigns.
Technologies: PostgreSQL, BigQuery, Tableau, Business Intelligence (BI), SQL, APIs, Data Analysis, ETL, Data Warehousing, Google BigQuery, Data, Data Auditing, Agile

Associate Consultant

2014 - 2014
MicroStrategy
  • Completed a six-week technical boot camp with a final score of 98% and received the Rookie of the Quarter Award.
  • Conducted business model analysis and prepared business intelligence solution recommendations.
  • Designed and developed reports and dashboards for various stakeholders and migrated reports from different BI solutions.
Technologies: MicroStrategy, Oracle, Business Intelligence (BI), SQL, PostgreSQL, Data Analysis, ETL, APIs, Data, Data Auditing

ETL Framework and Data Warehouse for Calculation Results

I migrated a Data Vault 2.0 data warehouse with more than 250 billion rows and 2.5 TB of data to Snowflake. Additionally, I designed and developed a Python-based ELT framework to ingest incoming data consisting of thousands of files from multiple data sources in parallel. The data warehouse had to handle continuous updates of historical data while maintaining full data traceability.
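
The traceability requirement is the interesting part. One plausible sketch of it (the etl.file_loads table, its columns, and the %s paramstyle of the DB-API cursor are assumptions, not the project's actual schema): hash every incoming file and record each load, so identical re-deliveries are skipped and every row can be traced back to its source file.

    import hashlib
    from datetime import datetime, timezone
    from pathlib import Path

    def file_digest(path):
        # Content hash identifies a file regardless of its name.
        h = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def register_load(cur, path):
        # Returns True if the file is new and should be loaded;
        # re-delivered files are skipped, keeping loads idempotent.
        digest = file_digest(path)
        cur.execute(
            "SELECT 1 FROM etl.file_loads WHERE file_hash = %s", (digest,)
        )
        if cur.fetchone():
            return False
        cur.execute(
            "INSERT INTO etl.file_loads (file_name, file_hash, loaded_at)"
            " VALUES (%s, %s, %s)",
            (path.name, digest, datetime.now(timezone.utc)),
        )
        return True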

Data Warehouse and Dashboards for a Data Wall

https://www.newswire.com/news/slemma-focuses-on-differance-of-being-data-driven-vs-data-informed-18957580
I designed and developed a data warehouse gathering YouTube views and engagement statistics necessary to track the performance of YouTube channels. One of the main goals of the project was to encourage data-informed decision-making processes. We achieved that by creating a data wall with dashboards.
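
For a sense of the extraction side, here is a minimal sketch of pulling channel statistics with the YouTube Data API v3 via google-api-python-client; the API key and channel ID are placeholders, and the project may well have used the YouTube Analytics API instead.

    from googleapiclient.discovery import build

    def channel_stats(api_key, channel_id):
        youtube = build("youtube", "v3", developerKey=api_key)
        resp = youtube.channels().list(
            part="statistics", id=channel_id
        ).execute()
        stats = resp["items"][0]["statistics"]
        # Numeric fields (viewCount, subscriberCount, videoCount)
        # arrive as strings; hiddenSubscriberCount is a boolean.
        return {
            k: int(v) for k, v in stats.items()
            if isinstance(v, str) and v.isdigit()
        }

    if __name__ == "__main__":
        print(channel_stats(api_key="...", channel_id="UC..."))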

Python API Wrapper

I designed and developed a Python API wrapper that allows the client to efficiently access a mission-critical API. The client had been using a GUI-based tool to parse and standardize postal addresses, which required a lot of manual, time-consuming work from a data engineer. My solution offered an easy-to-use Python package to programmatically interact with the SOAP-based API, streamlining work for both data engineers and data scientists.

Additionally, I developed Lambda-based Redshift functions serving as extensions of the package, allowing the analytics team to use the API on the fly from their SQL scripts.
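
A minimal sketch of the wrapper idea using the zeep SOAP client; the WSDL URL and the StandardizeAddress operation name are hypothetical stand-ins for the vendor's actual service.

    from zeep import Client
    from zeep.helpers import serialize_object

    class AddressClient:
        """Thin Pythonic facade over a SOAP address-standardization API."""

        def __init__(self, wsdl_url="https://example.com/address?wsdl"):
            # zeep parses the WSDL and generates typed operations.
            self._client = Client(wsdl_url)

        def standardize(self, raw_address):
            # Hypothetical operation name; serialize_object converts the
            # zeep response object into plain dicts and lists.
            result = self._client.service.StandardizeAddress(
                address=raw_address
            )
            return serialize_object(result)

Wrapping the generated client behind a plain-Python interface like this is what lets both data engineers and data scientists call the service without touching SOAP details.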

Languages

Python, SQL, Snowflake, Python 3

Tools

Tableau, Amazon Redshift Spectrum, DataGrip, PyCharm, BigQuery, Saiku

Paradigms

Business Intelligence (BI), ETL, Agile

Platforms

Amazon Web Services (AWS), AWS Lambda, Oracle, Talend

Storage

PostgreSQL, Redshift, MySQL, Microsoft SQL Server, Amazon Aurora, Amazon S3 (AWS S3), Exasol

Other

Data Engineering, Data Architecture, Google BigQuery, Data Warehousing, Data Vaults, Data Analysis, Data, ETL Tools, Econometrics, Decision Analysis, Decision Trees, Technical Architecture, Data Auditing, Segment, MicroStrategy, Slemma, Adjust, APIs, SOAP, Monitoring, Industrial Internet of Things (IIoT)

2012 - 2014

Master's Degree in Quantitative Methods in Economics and Information Systems

Warsaw School of Economics (SGH) - Warsaw, Poland

2009 - 2012

Bachelor's Degree in Quantitative Methods in Economics and Information Systems

Warsaw School of Economics (SGH) - Warsaw, Poland
