
Tomasz Zielański

Verified Expert in Engineering

Data Engineer and Database Developer

Tarnowskie Góry, Poland

Toptal member since January 28, 2022

Bio

Tomasz is a data engineer with over eight years of industry experience. He specializes in engineering reliable data pipelines and building cloud data warehouses and data lakes. He has significant experience in the fintech and healthcare industries. Tomasz excels at using data to solve business challenges and is passionate about the data mesh approach to organizing the data lifecycle across an enterprise.

Portfolio

Garner Health Technology
Python 3, Kubernetes, Snowflake, Data Build Tool (dbt), AWS Batch...
BlueSoft
Python 3, Scikit-learn, FastAPI, Artificial Intelligence (AI), Python, Git
Stermedia
Python 3, Scikit-learn, Pandas, Python, Machine Learning, SQL, Statistics...

Experience

  • SQL - 8 years
  • Python 3 - 8 years
  • Data Engineering - 6 years
  • Python - 6 years
  • Amazon Web Services (AWS) - 3 years
  • Snowflake - 2 years
  • Artificial Intelligence (AI) - 2 years
  • Kubernetes - 2 years

Availability

Part-time

Preferred Environment

Python 3, Google Cloud, Amazon Web Services (AWS), SQL, Kubernetes, Snowflake, Data Build Tool (dbt), Argo Workflows

The most amazing...

...project I've worked on was building a data lake from scratch for the first fintech in Europe to operate entirely on Google Cloud.

Work Experience

Senior Data Engineer

2022 - PRESENT
Garner Health Technology
  • Developed an ETL pipeline that ingests over 80 GB of nested JSON data into a normalized data model (sketched below).
  • Created a data synchronization pipeline between various data warehouse sources and the marketing platform, enabling well-informed, fine-tuned communication campaigns to be sent to users.
  • Built a data pipeline leveraging large language models for information enrichment.
  • Introduced a new standard for orchestrated, automated batch jobs in critical data-centered business workflows.
  • Refactored the data lake architecture so that reporting teams could collaborate under the data mesh paradigm.
Technologies: Python 3, Kubernetes, Snowflake, Data Build Tool (dbt), AWS Batch, Amazon S3 (AWS S3), Amazon Web Services (AWS), Argo Workflows, Terraform, GitLab CI/CD, PostgreSQL, Amazon EC2, Artificial Intelligence (AI), SQL, Python, Data Lakes, Data Pipelines, Data Engineering, ETL, Git
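
As a rough illustration of the ingestion work above, here is a minimal sketch, not the production pipeline, of streaming a large line-delimited JSON file in batches and splitting each nested document into parent and child rows for a normalized model. All field, table, and function names are hypothetical.

    # Minimal sketch: stream line-delimited JSON in batches so an 80 GB file
    # never has to fit in memory, splitting each nested document into one
    # parent row plus child rows. All names here are hypothetical.
    import json
    from typing import Any

    def split_document(raw: dict[str, Any]) -> tuple[dict, list[dict]]:
        """Turn one nested JSON document into a parent row plus child rows."""
        parent = {"record_id": raw["id"], "source": raw.get("source")}
        children = [
            {"record_id": raw["id"], "code": item.get("code"), "amount": item.get("amount")}
            for item in raw.get("items", [])
        ]
        return parent, children

    def flush(parents: list[dict], children: list[dict]) -> None:
        # Placeholder: in practice, stage the batch as files and bulk-load it
        # into the warehouse (e.g., Snowflake's COPY INTO).
        print(f"loading {len(parents)} parent rows, {len(children)} child rows")

    def ingest(path: str, batch_size: int = 10_000) -> None:
        parents, children = [], []
        with open(path) as fh:
            for line in fh:
                p, c = split_document(json.loads(line))
                parents.append(p)
                children.extend(c)
                if len(parents) >= batch_size:
                    flush(parents, children)
                    parents, children = [], []
        if parents:
            flush(parents, children)

Batching keeps memory usage flat and maps naturally onto warehouse bulk-load commands.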

Data Scientist

2022 - 2022
BlueSoft
  • Developed a healthcare data classification application for one of the major pharmaceutical corporations in Europe (a serving sketch follows this entry).
  • Analyzed the impact of infrastructure updates on the classifier's behavior and output.
  • Added new features for data visualization, significantly increasing visibility into the classification results.
Technologies: Python 3, Scikit-learn, FastAPI, Artificial Intelligence (AI), Python, Git
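
As a rough illustration of serving such a classifier, here is a minimal sketch of exposing a pre-trained scikit-learn model through FastAPI. The model artifact, endpoint, and field names are assumptions, not the client's actual API.

    # Minimal serving sketch: load a pre-trained scikit-learn model with joblib
    # and expose it over HTTP with FastAPI. Names here are hypothetical.
    import joblib
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()
    model = joblib.load("classifier.joblib")  # hypothetical pre-trained artifact

    class Record(BaseModel):
        features: list[float]

    @app.post("/classify")
    def classify(record: Record) -> dict:
        label = model.predict([record.features])[0]
        confidence = model.predict_proba([record.features])[0].max()
        return {"label": str(label), "confidence": float(confidence)}

Served with uvicorn, this gives downstream systems a single, versionable HTTP entry point to the model.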

Data Scientist

2021 - 2022
Stermedia
  • Developed a machine learning model for startup classification and integrated it with a web application (see the training sketch after this entry).
  • Conducted technical consultations for prospective clients.
  • Prepared work estimations for machine learning projects for prospective clients.
Technologies: Python 3, Scikit-learn, Pandas, Python, Machine Learning, SQL, Statistics, Data Cleansing, Data Transformation, Jupyter, GitHub, Artificial Intelligence (AI), Git
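
A minimal sketch of what a text-based startup classifier could look like, assuming labelled company descriptions; the toy dataset below is a placeholder, not Stermedia's data.

    # Minimal training sketch: TF-IDF features plus logistic regression in a
    # scikit-learn Pipeline. The four labelled descriptions are placeholders.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import Pipeline

    descriptions = [
        "b2b payments api for marketplaces",
        "telemedicine platform for clinics",
        "loan underwriting with open banking data",
        "remote patient monitoring devices",
    ]
    labels = ["fintech", "healthtech", "fintech", "healthtech"]

    X_train, X_test, y_train, y_test = train_test_split(
        descriptions, labels, test_size=0.5, stratify=labels, random_state=0
    )

    clf = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("model", LogisticRegression(max_iter=1000)),
    ])
    clf.fit(X_train, y_train)
    print(clf.predict(["ai-assisted radiology reports"]))  # predicts one of the two labels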

Data Engineer

2021 - 2021
BlueSoft
  • Developed a cross-account access solution for a distributed AWS Data Lake architecture.
  • Optimized an Apache Spark pipeline to speed up the processing of 1 GB+ datasets (see the sketch below).
  • Added new functions to an on-demand data pipeline created on AWS EMR.
Technologies: Python 3, Amazon Web Services (AWS), Apache Airflow, Apache Spark, Amazon Elastic MapReduce (EMR), AWS Glue, Amazon Athena, Amazon S3 (AWS S3), Amazon EC2, Python, Data Lakes, Data Pipelines, Data Engineering, SQL, Data Transformation, REST APIs, APIs, Linux, ETL, GitHub, Git
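
To give a flavor of that kind of Spark tuning, here is a minimal sketch, not the actual pipeline, combining a broadcast join for a small dimension table with caching of a reused DataFrame. Paths and column names are hypothetical.

    # Minimal tuning sketch: broadcast the small dimension table to avoid a
    # shuffle join, and cache the fact data because it is reused downstream.
    # All paths and column names are hypothetical.
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.functions import broadcast

    spark = SparkSession.builder.appName("spark-tuning-sketch").getOrCreate()

    events = spark.read.parquet("s3://example-bucket/events/")      # large fact table
    lookup = spark.read.parquet("s3://example-bucket/dim_lookup/")  # small dimension

    events = events.repartition(200, "customer_id")  # even out partition sizes
    events.cache()                                   # reused by several outputs

    enriched = events.join(broadcast(lookup), "customer_id")
    daily = enriched.groupBy("event_date").agg(F.count("*").alias("event_count"))
    daily.write.mode("overwrite").parquet("s3://example-bucket/daily_counts/")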

Data Engineer

2018 - 2021
Vodeno
  • Took part in designing a GCP data lake from scratch, including table modeling, storage technology selection, and data modeling guidelines.
  • Implemented a streaming data pipeline solution in Java 8, Kotlin, and Google Cloud Dataflow (Apache Beam); a simplified sketch follows this entry.
  • Designed pipelines for BI reporting, including modeling data structures, implementing data pipelines, and creating dashboards in Google Data Studio.
  • Developed a propensity-to-buy machine learning model.
Technologies: Python 3, BigQuery, BigTable, Google Cloud, Java 8, Kotlin, Cloud Dataflow, Google Data Studio, Scikit-learn, Python, Java, Machine Learning, Google Cloud Platform (GCP), Data Lakes, Data Pipelines, Apache Beam, BI Reporting, Dashboards, Data Structures, Data Engineering, SQL, Data Warehousing, Predictive Modeling, Dashboard Design, Statistics, Data Cleansing, Data Transformation, Jupyter, Linux, ETL, GitHub, Artificial Intelligence (AI), Git
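
The production streaming pipelines were written in Java 8 and Kotlin; as a rough sketch of the same shape of pipeline, here is a Beam Python SDK version that reads from Pub/Sub and appends to BigQuery. The topic, table, and schema are hypothetical.

    # Simplified streaming sketch in the Beam Python SDK (the production
    # pipelines were Java 8/Kotlin). Topic, table, and schema are hypothetical.
    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def run() -> None:
        options = PipelineOptions(streaming=True)
        with beam.Pipeline(options=options) as p:
            (
                p
                | "ReadEvents" >> beam.io.ReadFromPubSub(
                    topic="projects/example/topics/transactions")
                | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
                | "WriteToLake" >> beam.io.WriteToBigQuery(
                    "example:lake.transactions",
                    schema="transaction_id:STRING,amount:FLOAT,ts:TIMESTAMP",
                    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                )
            )

    if __name__ == "__main__":
        run()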

ETL Developer

2016 - 2018
ING Bank Śląski
  • Built multiple data pipelines in IBM Infosphere DataStage.
  • Designed snowflake-schema structures for an Oracle data warehouse.
  • Maintained an Oracle data warehouse and solved bugs in collaboration with business teams.
Technologies: Oracle, IBM InfoSphere (DataStage), erwin Data Modeler, ETL, Data Pipelines, Data Warehousing, SQL, Data Structures, Data Engineering, Data Cleansing, Data Transformation, Jupyter, Linux, GitHub, Git

Experience

Vodeno Cloud Platform for Banking

https://www.vodeno.com/#platform
A bank-in-the-box platform based on Google Cloud and the first bank in Europe to be hosted solely on GCP. I took part in creating the data lake services from scratch until they were fully operational in production. My focus areas included architecture design, storage technology selection, implementation of the streaming ETL pipeline framework, reporting dashboard design, and predictive modeling.

Education

2011 - 2016

Master's Degree in Automatics and Robotics

Silesian University of Technology - Gliwice, Poland

Certifications

NOVEMBER 2021 - PRESENT

Deep Learning

DeepLearning.ai

APRIL 2021 - PRESENT

AWS Cloud Practitioner Essentials

Amazon Web Services

MARCH 2020 - PRESENT

Machine Learning

Stanford University

SEPTEMBER 2016 - PRESENT

Certificate of Proficiency in English

University of Cambridge

Skills

Libraries/APIs

Scikit-learn, Pandas, REST APIs

Tools

Jupyter, Git, BigQuery, GitHub, Terraform, Apache Airflow, Amazon Elastic MapReduce (EMR), AWS Glue, Amazon Athena, Cloud Dataflow, IBM InfoSphere (DataStage), Apache Beam, AWS Batch, GitLab CI/CD

Languages

Python 3, SQL, Python, Snowflake, Java 8, Kotlin, Java

Paradigms

ETL

Storage

Amazon S3 (AWS S3), Data Pipelines, BigTable, Data Lakes, PostgreSQL, Google Cloud

Platforms

Amazon Web Services (AWS), Amazon EC2, Linux, Kubernetes, Oracle, Google Cloud Platform (GCP)

Frameworks

Apache Spark

Other

Data Engineering, Data Transformation, Data Cleansing, Argo Workflows, Software Development, Machine Learning, Google Data Studio, Artificial Intelligence (AI), Data Structures, APIs, Data Build Tool (dbt), erwin Data Modeler, Deep Learning, BI Reporting, Dashboards, Data Warehousing, Architecture, Dashboard Design, Predictive Modeling, Statistics, FastAPI
