Jakub Kiełbasiewicz, Data Warehousing Developer in Wrocław, Poland

Member since April 11, 2019
Jakub is a skilled data engineer looking forward to his next challenge. He loves working with databases, NoSQL, and distributed systems, especially the ETL side.

Portfolio

  • Roche
    Azure, Python, SQL, MSSQL, Databricks, Spark, Streaming
  • Aberdeen
    SQL Server, SSIS, C#, Python, AWS, MSSQL, TSQL, PySpark, Spark
  • Ryanair
    SQL, SSIS, C#, SSAS, MDX, DAX, MSSQL, T-SQL, SQL Server

Location

Wrocław, Poland

Availability

Part-time

Preferred Environment

Python, Spark, Big Data

The most amazing...

...project I've contributed to was the IoT Data Quality project, which allowed the business to make better decisions.

Employment

  • Cloud Data Engineer

    2020 - PRESENT
    Roche
    • Created data pipelines for IoT solutions in Python, PySpark, Azure Data Factory, and Azure Synapse.
    • Validated and migrated on-prem Python to Azure-specific offerings.
    • Documented the process of ingesting and transforming data.
    Technologies: Azure, Python, SQL, MSSQL, Databricks, Spark, Streaming
  • Senior Data Engineer/Database Lead

    2019 - 2020
    Aberdeen
    • Created new ways to clean up the data, using web scraping and algorithms (Python, PySpark).
    • Designed master data.
    • Designed and implemented ETL processes.
    • Handled SQL Server administration, including data partitioning, index maintenance, security, and backup policies.
    • Automated QA testing in Python.
    • Maintained the Elasticsearch cluster.
    • Migrated the whole product to AWS, redesigning the data product with AWS EMR, S3, RDS, Lambda, Athena, and Glue.
    • Integrated heterogeneous data sources.
    • Implemented business logic in Python.
    Technologies: SQL Server, SSIS, C#, Python, AWS, MSSQL, TSQL, PySpark, Spark
  • Business Intelligence Developer

    2019 - 2019
    Ryanair
    • Prepared PoC solutions in Azure (Databricks, ADF), graph databases, and Power BI, which allowed the BI team to manage relationships between ETL processes more easily.
    • Created ETL processes that allowed the company to integrate data during the LaudaMotion takeover and to use data science algorithms to measure fuel consumption, using Hadoop, Impala, and Python scripts for third-party API integration.
    • Modeled new data marts focused on integrating big data solutions based on web analytics and marketing with the consumer-related data warehouse.
    • Developed data quality checks.
    Technologies: SQL, SSIS, C#, SSAS, MDX, DAX, MSSQL, T-SQL, SQL Server
  • ETL & Automation Analyst

    2018 - 2018
    Ryanair
    • Tuned query performance, substantially improving the ETL processes behind the Marketing team's reporting solutions.
    • Automated data integration (R, Python, SQL, MDX), fully eliminating manual steps from the ETL process.
    • Resolved Tableau issues.
    • Automated internal processes to get the data.
    • Set up query performance tuning.
    Technologies: VBA, SQL, Python, R, Java, MDX, Tableau, Power BI, TSQL, MSSQL, SQL Server
  • Business Intelligence Consultant

    2017 - 2018
    Tech Data
    • Resolved ETL issues.
    • Created new ETL data flows.
    • Developed advanced TSQL solutions.
    • Automated tasks using PowerShell, TSQL, and SSIS.
    • Designed and maintained reporting environment.
    • Prepared reports and dashboards for high-level management.
    Technologies: SQL, TSQL, SSAS, SSIS, VBA, Python, PowerShell, R, C#, MSSQL

Experience

  • IoT Data Quality Dashboard (Development)

    Developed a data quality solution for an IoT implementation at a manufacturing company, which made it possible to assess the quality of sensor data. I re-modeled the database, created a metamodel for it (Azure SQL), and designed and developed a Power BI dashboard (DAX + Python). The project involved implementing machine learning algorithms such as Isolation Forest. Along with the dashboard, I proposed a new approach to data quality management in companies: an iterative methodology and a step-by-step process for working with IoT data quality. The project is in the improvement phase, and I hope to present it at various conferences.

  • Web App for Property Management - Rent/Sell/Buy (Development)

    I took part in an academic project to create a web-based application for property management, which allowed people to add offers, contact owners, etc.

    I was responsible for infrastructure (Azure stack) and the database.

  • Digital Goods Warehouse and Bookkeeping Web App (Development)

    I'm taking part in a large, partly academic project that has been entered into an academic projects competition. The application will manage digital goods (such as videos and pictures), letting people keep their goods in one place and track all bookkeeping related to them.

    I'm responsible for the infrastructure (Azure stack) and the database. That's a huge challenge because every user needs a separate data storage solution and a separate database, created dynamically when the user creates a new account.

  • Tutoring (Development)

    I help people every day with their data-related troubles. Whether it's an academic project, a master's or bachelor's thesis, or a solution for a company, every project is different and requires a different approach.

Skills

  • Languages

    SQL, T-SQL, R, Python, Java
  • Tools

    Synapse, BigQuery, Azure HDInsight, Microsoft Power BI, Apache Impala, Apache Airflow, Tableau, Git, Jenkins, TFS
  • Paradigms

    Database Design, ETL, Azure DevOps
  • Platforms

    Databricks, Azure, Amazon Web Services (AWS), Google Cloud Platform (GCP), Pentaho, Talend
  • Storage

    Databases, SQL Server Integration Services (SSIS), Redshift, Database Modeling, PL/SQL, MySQL, SQL Server Analysis Services (SSAS), Elasticsearch, PostgreSQL, Azure SQL Databases, Azure Cosmos DB, Google Cloud, NoSQL
  • Other

    Azure Data Factory, SQL Server, MSSQL, Data Engineering, Data Warehousing, Google BigQuery, Query Optimization, AWS, Apache Cassandra, Azure Data Lake, Azure Data Lake Analytics, Tableau Server, MicroStrategy, SOAP
  • Frameworks

    Hadoop, Apache Spark
  • Libraries/APIs

    PySpark, Pandas, NumPy, Matplotlib

Education

  • Bachelor's degree in Computer Science
    2016 - 2020
    Wrocław University of Science and Technology - Wrocław, Poland
  • Bachelor's degree in Economics
    2014 - 2017
    University of Wrocław - Wrocław, Poland

Certifications

  • Azure Data Engineer
    JULY 2020 - PRESENT
    Microsoft
  • AWS Cloud Practitioner
    MAY 2020 - PRESENT
    AWS
  • Associate Cloud Engineer
    DECEMBER 2019 - DECEMBER 2021
    Google
  • Azure Administrator Associate
    FEBRUARY 2019 - PRESENT
    Microsoft
  • MCSA: SQL 2016 BI Development
    AUGUST 2018 - PRESENT
    Microsoft
  • MCSE: Data Management and Analytics
    JULY 2018 - PRESENT
    Microsoft
  • MCSA: SQL 2016 Database Development
    JUNE 2018 - PRESENT
    Microsoft
  • MCSA: SQL 2016 Database Administration
    SEPTEMBER 2017 - PRESENT
    Microsoft
