Robert Pohnke, Developer in Warsaw, Poland

Robert Pohnke

Verified Expert in Engineering

Data Architecture Developer

Location
Warsaw, Poland
Toptal Member Since
August 12, 2022

Robert is a seasoned cloud architect focused on Azure and AWS architecture and development. Over the years, he has helped clients in banking, healthcare, energy, automotive, and marketing gather and leverage their data. Robert has successfully led numerous data lake and Delta Lake implementations.

Portfolio

Sodexo
Azure, Azure Synapse, Dremio, Databricks, Azure Data Factory, Data Engineering...
DHL International
Azure, Azure Databricks, Databricks, Azure SQL, Azure Synapse...
DasLab
Amazon Web Services (AWS), HL7 FHIR Standard...

Experience

Availability

Full-time

Preferred Environment

Azure, Databricks, Delta Lake

The most amazing...

...experience I've led was designing and implementing an Azure data lake from the ground up and overseeing its global release.

Work Experience

Azure Architect

2022 - 2023
Sodexo
  • Designed and implemented a new framework for generic batch ingestion as a part of a global data platform project.
  • Designed and implemented data catalog capabilities with Databricks and Dremio.
  • Designed and implemented a streaming application with Databricks and myDevices.
Technologies: Azure, Azure Synapse, Dremio, Databricks, Azure Data Factory, Data Engineering, Data Architecture, Big Data, SQL, ETL, Data Lakes, Data Warehousing, Data Pipelines, Streaming Data, Data Governance, Data Integration, Solution Architecture, Data, Databases

Azure Architect

2022 - 2023
DHL International
  • Designed and implemented a new framework for data ingestion with Databricks, Azure Data Factory, Azure Synapse, and Azure DevOps.
  • Implemented config-driven generic ETL pipelines for SFTP, Teradata, and SAP sources.
  • Replaced legacy on-premises MapR ingestion pipelines with an Azure-based solution, significantly improving performance and stability while reducing cost.
Technologies: Azure, Azure Databricks, Databricks, Azure SQL, Azure Synapse, Azure Data Factory, Data Engineering, Data Architecture, Big Data, SQL, ETL, Data Lakes, Data Warehousing, Data Pipelines, Data Governance, Data Integration, Solution Architecture, Data, Databases
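The config-driven generic ETL approach mentioned above can be sketched in plain Python: a dispatcher maps each source type in a config to a dedicated reader. All names, configs, and reader bodies here are illustrative stand-ins, not details of the actual DHL framework.

```python
# Minimal sketch of a config-driven ingestion dispatcher, assuming each
# source type (e.g., sftp, teradata) maps to a dedicated reader function.
# Everything below is hypothetical; real readers would return DataFrames.
from typing import Callable, Dict, List


def read_sftp(cfg: dict) -> str:
    # Stand-in: a real reader would fetch files over SFTP.
    return f"sftp://{cfg['host']}/{cfg['path']}"


def read_teradata(cfg: dict) -> str:
    # Stand-in: a real reader would issue a JDBC query.
    return f"teradata://{cfg['dsn']}/{cfg['table']}"


READERS: Dict[str, Callable[[dict], str]] = {
    "sftp": read_sftp,
    "teradata": read_teradata,
}


def run_pipeline(sources: List[dict]) -> List[str]:
    """Dispatch each source config to its reader; fail fast on unknown types."""
    results = []
    for src in sources:
        reader = READERS.get(src["type"])
        if reader is None:
            raise ValueError(f"unsupported source type: {src['type']}")
        results.append(reader(src))
    return results


config = [
    {"type": "sftp", "host": "files.example.com", "path": "daily.csv"},
    {"type": "teradata", "dsn": "dwh", "table": "orders"},
]
print(run_pipeline(config))
```

Adding a new source then means registering one reader function, not writing a new pipeline, which is the point of the config-driven design.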

Freelance Solutions Architect

2021 - 2022
DasLab
  • Gathered functional and non-functional requirements to create the next-gen healthcare platform based on AWS and FHIR.
  • Designed the platform architecture and application components to serve high volumes of data, underpinned by Kubernetes, IBM FHIR Server, and RDS.
  • Evaluated multiple solutions through proof of concept (POC).
  • Used Amazon EKS, Amazon ECR, Amazon S3, Amazon RDS, Amazon SQS, Amazon Simple Notification Service (SNS), Amazon EventBridge, Amazon Virtual Private Cloud (VPC), AWS Fargate, Amazon EC2, FHIR, and Keycloak.
Technologies: Amazon Web Services (AWS), HL7 FHIR Standard, Fast Healthcare Interoperability Resources (FHIR), Amazon EKS, Amazon Elastic Container Registry (ECR), AWS Fargate, Amazon RDS, Cloud, DevOps, Cloud Architecture, Data Engineering, Big Data Architecture, AWS Cloud Architecture, Data Warehouse Design, Data Architecture, Big Data, SQL, ETL, Data Warehousing, Data Governance, Data Integration, Solution Architecture, Data, Databases, APIs

Freelance Big Data Consultant

2021 - 2021
Publicis Groupe
  • Investigated performance problems in Azure Databricks and tuned the cluster settings to accelerate large file transformations.
  • Implemented historical data preprocessing procedures using Azure Functions that transformed over 6TB of data.
  • Introduced the Databricks SQL Analytics cluster as a data warehouse layer for business intelligence (BI).
Technologies: Big Data Architecture, Azure, Azure Functions, Azure Data Factory, Azure Data Lake, Azure Databricks, Python, Data Warehousing, Reporting, Microsoft Azure, Data Engineering, Data Architecture, Big Data, SQL, ETL, Data Lakes, Data Pipelines, Data Governance, Data Integration, Solution Architecture, Data, Databases
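Tuning large file transformations in Spark often starts with sizing shuffle partitions against the data volume. The heuristic below (aim for roughly 128 MB per partition) is a common rule of thumb, sketched as a standalone function; the figures are illustrative and not the actual settings used at Publicis.

```python
# Common Spark tuning heuristic: pick spark.sql.shuffle.partitions so
# each partition lands near a target size (~128 MB). Illustrative only.

def suggested_shuffle_partitions(input_bytes: int,
                                 target_partition_bytes: int = 128 * 1024 * 1024,
                                 min_partitions: int = 1) -> int:
    """Ceiling-divide input size by target partition size, never below min."""
    if input_bytes <= 0:
        return min_partitions
    return max(min_partitions, -(-input_bytes // target_partition_bytes))


# 6 TB of input at ~128 MB per partition
six_tb = 6 * 1024**4
print(suggested_shuffle_partitions(six_tb))  # → 49152
```

In a real job this value would be applied via `spark.conf.set("spark.sql.shuffle.partitions", n)` before the wide transformation runs.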

Freelance Big Data Consultant

2021 - 2021
Allianz
  • Led the transformation of a harmonized data layer into a consumption layer for a Fleet applications project.
  • Implemented data model mappings and ETL pipelines in Azure Data Factory and deployed the portal back end to Azure Kubernetes Service (AKS).
  • Created data models in SqlDBM to capture insurance data.
Technologies: PostgreSQL, Azure, Azure Data Factory, Azure Kubernetes Service (AKS), Azure Data Lake, Cloud, Spark, DevOps, Cloud Architecture, Data Engineering, Big Data Architecture, Data Architecture, Big Data, SQL, ETL, Data Lakes, Data Warehousing, Data Pipelines, Data Governance, Data Integration, Kubernetes, Solution Architecture, Data, Databases, APIs

Freelance Big Data Architect

2018 - 2021
Schaeffler
  • Led the design and implementation of a data lake based on Microsoft Azure to ingest data from cross-organizational data sources. Performed SAP data migration to Azure Synapse.
  • Industrialized Python and R machine learning (ML) models and deployed them to Spark, VM, Azure Kubernetes Service (AKS), and Azure Batch.
  • Introduced guidelines for data scientists working in Python and R, and set up Jupyter, Flask, scikit-learn, and RStudio Shiny environments.
  • Ingested sensor stream data from production plants through EventHub and Spark Streaming.
Technologies: Azure, Cloud Architecture, Big Data Architecture, Databricks, Azure Kubernetes Service (AKS), Azure Functions, Azure Machine Learning, PySpark, Data Lakes, Delta Lake, Azure Data Factory, Cloud, Spark, DevOps, Data Engineering, Azure Synapse, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, Spark Structured Streaming, Data Migration, Reporting, Data Architecture, Big Data, SQL, ETL, Data Warehousing, Data Pipelines, Streaming Data, Data Governance, Data Integration, Kubernetes, Solution Architecture, Data, Databases, APIs
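The sensor-stream ingestion described above typically ends in a windowed aggregation. The sketch below reduces that step to pure Python: a tumbling-window average per sensor, the kind of computation a Spark Structured Streaming job would perform on Event Hub data. Event fields and the window size are assumptions for illustration.

```python
# Pure-Python stand-in for a tumbling-window aggregation over plant
# sensor events, as a Structured Streaming job might compute it.
# Event shape (timestamp, sensor_id, value) is an assumption.
from collections import defaultdict


def window_averages(events, window_seconds=60):
    """Group (ts, sensor_id, value) events into tumbling windows and
    return the average value per (window_start, sensor_id)."""
    buckets = defaultdict(list)
    for ts, sensor_id, value in events:
        window_start = (ts // window_seconds) * window_seconds
        buckets[(window_start, sensor_id)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


events = [
    (5, "temp-1", 20.0),
    (30, "temp-1", 22.0),
    (65, "temp-1", 24.0),
]
print(window_averages(events))
# → {(0, 'temp-1'): 21.0, (60, 'temp-1'): 24.0}
```

In Spark the same logic is `groupBy(window("ts", "1 minute"), "sensor_id").avg("value")`, with the engine handling late data and state.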

Freelance Big Data Consultant

2020 - 2020
Essity
  • Created CI/CD pipelines to enable ML notebook versioning and deployment in Databricks and an ML workspace.
  • Improved an ML development workflow by incorporating Azure ML Studio.
  • Participated in migration efforts from a data lake to a new version based on Azure Data Lake Storage (ADLS).
  • Implemented ETL pipelines in Azure Data Factory for the shared Delta Lake.
Technologies: Azure, Azure Data Factory, Azure Databricks, Delta Lake, Azure Machine Learning, Python, PySpark, Data Engineering, Data Architecture, Big Data, SQL, ETL, Data Lakes, Data Warehousing, Data Pipelines, Data Governance, Data Integration, Solution Architecture, Data, Databases

Freelance DevOps Consultant

2019 - 2019
Adidas
  • Designed and implemented a testing framework and rolled out schema updates to Exasol tables across multiple projects and environments.
  • Created dockerized Exasol environments and CI/CD pipelines using Jenkins.
  • Introduced guidelines for database developers working in Python.
  • Participated in architecture discussions and code reviews.
Technologies: Python, Jenkins, Exasol, Docker, Bash, Data Architecture, Big Data, SQL, ETL, Data Warehousing, Data Integration, Data, Databases, APIs
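A testing framework for schema rollouts usually centers on a drift check: compare a table's expected columns against what an environment actually has. The sketch below is a hypothetical, minimal version of that check; the column names and Exasol types are made up for illustration.

```python
# Hedged sketch of a schema-drift check, as a rollout test framework
# might run against each environment. All names/types are illustrative.

def schema_diff(expected: dict, actual: dict) -> dict:
    """Report columns missing from the environment, unexpected extras,
    and columns whose declared types disagree."""
    missing = sorted(set(expected) - set(actual))
    extra = sorted(set(actual) - set(expected))
    changed = sorted(c for c in set(expected) & set(actual)
                     if expected[c] != actual[c])
    return {"missing": missing, "extra": extra, "type_changed": changed}


expected = {"id": "DECIMAL(18,0)", "name": "VARCHAR(100)", "created": "TIMESTAMP"}
actual = {"id": "DECIMAL(18,0)", "name": "VARCHAR(50)", "legacy_flag": "BOOLEAN"}
print(schema_diff(expected, actual))
```

In a CI pipeline, a non-empty diff fails the build before the schema update is promoted to the next environment.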

Freelance Big Data Architect

2017 - 2018
BNP Paribas
  • Contributed to the architecture and implementation of centralized on-premises data lakes for all enterprise-sensitive data in the organization.
  • Developed Ansible playbooks for Hadoop cluster creation via Apache Ambari.
  • Created Bash scripts for automation and task scheduling, and configured and secured Hortonworks Data Platform (HDP) clusters.
Technologies: Hortonworks Data Platform (HDP), Spark, Ansible, Apache Kafka, Data Engineering, Data Architecture, Big Data, SQL, ETL, Data Lakes, Data Warehousing, Data Pipelines, Data Governance, Data Integration, Solution Architecture, Data, Databases, APIs

Freelance Big Data Consultant

2017 - 2017
Nordea
  • Joined a team responsible for the architecture and implementation of a data lake storing transaction and account history.
  • Developed analytics Spark jobs that delivered reports to a suite of mainframes.
  • Developed ETL pipelines in Spark, Flume, and Oozie to ingest data into the data lake.
  • Configured and secured clusters and implemented business-critical and resilient Oozie workflows in production.
Technologies: Apache Kafka, Oozie, Flume, Cloudera, Scala, HBase, Apache Hive, Bash, Spark, Data Engineering, Data Architecture, Big Data, SQL, ETL, Data Lakes, Data Warehousing, Data Pipelines, Streaming Data, Data Governance, Data Integration, Solution Architecture, Data, Databases

Freelance Big Data Architect

2016 - 2017
E.ON
  • Orchestrated architecture and implementation of ETL and ML pipelines in Spark.
  • Ingested sensor data from power plant assets via Kafka into OpenTSDB.
  • Performed data quality checks and missing data imputation.
  • Led distributed training and serving of ML models to generate real-time forecasts for power output for a wind park.
Technologies: Cloudera, Spark, Spark Streaming, Apache Kafka, OpenTSDB, Scala, Python, Data Engineering, Data Architecture, Big Data, SQL, ETL, Data Pipelines, Streaming Data, Data Integration, Solution Architecture, Data, Databases
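The missing-data imputation step mentioned above is commonly a forward-fill over each sensor series, with a back-fill for leading gaps. The sketch below is a pure-Python stand-in for what the Spark pipeline would do per sensor; the values are made up.

```python
# Illustrative forward-fill imputation for a sensor reading series:
# None gaps take the last seen value; leading Nones take the first
# value that follows. A pure-Python stand-in for the Spark version.

def forward_fill(series):
    """Replace None with the last observed value, then back-fill any
    leading Nones with the first non-None value."""
    filled = list(series)
    last = None
    for i, v in enumerate(filled):
        if v is None:
            filled[i] = last
        else:
            last = v
    # back-fill leading Nones
    first = next((v for v in filled if v is not None), None)
    for i, v in enumerate(filled):
        if v is None:
            filled[i] = first
        else:
            break
    return filled


print(forward_fill([None, None, 3.1, None, 4.2, None]))
# → [3.1, 3.1, 3.1, 3.1, 4.2, 4.2]
```

For time-series forecasting inputs, forward-fill preserves the last known sensor state until a fresh reading arrives, which is usually preferable to dropping rows.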

TransTracker

Developed a crowdsourced Android mobile app for tracking public transport delays, with a back end built in Java and PostgreSQL and a front end in JavaScript using the Google Maps API. I built the initial version, focused on public railway and bus transportation.

Languages

SQL, Bash, Python, Java, Scala

Paradigms

ETL, DevOps, HL7 FHIR Standard, Fast Healthcare Interoperability Resources (FHIR)

Platforms

Azure, Databricks, Docker, Apache Kafka, Azure Synapse, Kubernetes, Amazon Web Services (AWS), Azure Functions, Android, Hortonworks Data Platform (HDP), Azure SQL Data Warehouse, Google Cloud Platform (GCP), Dedicated SQL Pool (formerly SQL DW)

Storage

Data Lakes, Data Pipelines, Databases, Apache Hive, Database Architecture, Data Integration, PostgreSQL, Exasol, HBase, Azure SQL

Other

Cloud, Cloud Architecture, Data Engineering, Big Data Architecture, Big Data, Delta Lake, Azure Data Factory, Azure Data Lake, Azure Databricks, Microsoft Azure, Data Migration, Data Warehousing, Data Architecture, Solution Architecture, Data, Programming, Data Warehouse Design, Reporting, Streaming Data, Data Governance, APIs, Machine Learning Operations (MLOps), Amazon RDS, OpenTSDB, AWS Cloud Architecture, Dremio

Frameworks

Spark, Hadoop, Spark Structured Streaming

Libraries/APIs

PySpark, Spark Streaming

Tools

Azure Kubernetes Service (AKS), Synapse, Amazon EKS, Amazon Elastic Container Registry (ECR), AWS Fargate, Azure Machine Learning, Jenkins, Ansible, Oozie, Flume, Cloudera

2008 - 2014

Bachelor's Degree in Econometrics and IT

University of Warsaw - Warsaw, Poland

2008 - 2014

Bachelor's Degree in Computer Science

University of Warsaw - Warsaw, Poland
