
Robert Pohnke
Verified Expert in Engineering
Data Architecture Developer
Warsaw, Poland
Toptal member since August 12, 2022
Robert is a seasoned cloud architect focused on Azure and AWS architecture and development. Over the years, he has helped clients in banking, healthcare, energy, automotive, and marketing gather and leverage their data. Robert has successfully led numerous data and Delta Lake implementations.
Experience
- ETL - 10 years
- Data Engineering - 10 years
- Big Data - 10 years
- Big Data Architecture - 8 years
- Azure - 6 years
- Cloud - 6 years
- Cloud Architecture - 5 years
- Databricks - 5 years
Preferred Environment
Azure, Databricks, Delta Lake
The most amazing...
...experience I've had was leading an Azure Data Lake design and implementation from the ground up and overseeing its global release.
Work Experience
Azure Architect
Sodexo
- Designed and implemented a new framework for generic batch ingestion as a part of a global data platform project.
- Designed and implemented data catalog capabilities with Databricks and Dremio.
- Designed and implemented a streaming application with Databricks and myDevices.
Azure Architect
DHL International
- Designed and implemented a new framework for data ingestion with Databricks, Azure Data Factory, Azure Synapse, and Azure DevOps.
- Implemented config-driven generic ETL pipelines for SFTP, Teradata, and SAP sources.
- Replaced legacy on-premises MapR ingestion pipelines with an Azure-based solution, significantly improving performance and stability and reducing cost.
Freelance Solutions Architect
DasLab
- Gathered functional and non-functional requirements to create the next-gen healthcare platform based on AWS and FHIR.
- Designed the platform architecture and application components to serve high volumes of data, underpinned by Kubernetes, IBM FHIR Server, and Amazon RDS.
- Evaluated multiple solutions through proof of concept (POC).
- Used Amazon EKS, Amazon ECR, Amazon S3, Amazon RDS, Amazon SQS, Amazon Simple Notification Service (SNS), Amazon EventBridge, Amazon Virtual Private Cloud (VPC), AWS Fargate, Amazon EC2, FHIR, and Keycloak.
Freelance Big Data Consultant
Publicis Groupe
- Evaluated the performance problems with Azure Databricks and tuned the cluster settings to accelerate large file transformations.
- Implemented historical data preprocessing procedures using Azure Functions that transformed over 6TB of data.
- Introduced the Databricks SQL Analytics cluster as a data warehouse layer for business intelligence (BI).
Freelance Big Data Consultant
Allianz
- Led the transformation of a harmonized data layer into a consumption layer for a Fleet applications project.
- Implemented data model mappings and ETL pipelines in Azure Data Factory and deployed the portal back end to Azure Kubernetes Service (AKS).
- Created data models in SqlDBM to capture insurance data.
Freelance Big Data Architect
Schaeffler
- Led the design and implementation of a data lake based on Microsoft Azure to ingest data from cross-organizational data sources. Performed SAP data migration to Azure Synapse.
- Industrialized Python and R machine learning (ML) models and deployed them to Spark, VM, Azure Kubernetes Service (AKS), and Azure Batch.
- Introduced guidelines for data scientists working in Python and R. Set up Jupyter notebooks along with Flask, scikit-learn, and RStudio Shiny environments.
- Ingested sensor stream data from production plants through Azure Event Hubs and Spark Streaming.
Freelance Big Data Consultant
Essity
- Created CI/CD pipelines to enable ML notebook versioning and deployment in Databricks and an ML workspace.
- Improved an ML development workflow by incorporating Azure ML Studio.
- Participated in migration efforts from a data lake into a new version based on ADLS.
- Implemented ETL pipelines in Azure Data Factory for the shared Delta Lake.
Freelance DevOps Consultant
Adidas
- Designed and implemented a framework for testing. Rolled out schema updates to Exasol tables across multiple projects and environments.
- Created dockerized Exasol environments and CI/CD pipelines using Jenkins.
- Introduced guidelines for database developers working in Python.
- Participated in architecture discussions and code reviews.
Freelance Big Data Architect
BNP Paribas
- Contributed to the architecture and implementation of centralized on-premises data lakes for all enterprise-sensitive data in the organization.
- Developed playbooks with Ansible for Hadoop cluster creation with Apache Ambari.
- Created a Bash script for automation and task scheduling, and configured and secured Hortonworks Data Platform (HDP) clusters.
Freelance Big Data Consultant
Nordea
- Joined a team responsible for the architecture and implementation of a data lake storing transaction and account history.
- Developed analytics Spark jobs that produced reports for a suite of mainframes.
- Developed ETL pipelines in Spark, Flume, and Oozie to ingest data into the data lake.
- Configured and secured clusters and implemented business-critical and resilient Oozie workflows in production.
Freelance Big Data Architect
E.ON
- Orchestrated the architecture and implementation of ETL and ML pipelines in Spark.
- Ingested sensor data from power plant assets via Kafka into OpenTSDB.
- Performed data quality checks and missing data imputation.
- Led distributed training and serving of ML models to generate real-time power output forecasts for a wind park.
Experience
TransTracker
Education
Bachelor's Degree in Econometrics and IT
University of Warsaw - Warsaw, Poland
Bachelor's Degree in Computer Science
University of Warsaw - Warsaw, Poland
Skills
Libraries/APIs
PySpark, Spark Streaming
Tools
Azure Kubernetes Service (AKS), Synapse, Amazon EKS, Amazon Elastic Container Registry (ECR), AWS Fargate, Azure Machine Learning, Jenkins, Ansible, Oozie, Flume, Cloudera, Dremio
Languages
SQL, Bash, Python, Java, Scala
Paradigms
ETL, DevOps, HL7 FHIR Standard, Fast Healthcare Interoperability Resources (FHIR)
Platforms
Azure, Databricks, Docker, Apache Kafka, Azure Synapse, Kubernetes, Amazon Web Services (AWS), Azure Functions, Android, Hortonworks Data Platform (HDP), Azure SQL Data Warehouse, Google Cloud Platform (GCP), Dedicated SQL Pool (formerly SQL DW)
Storage
Data Lakes, Data Pipelines, Databases, Apache Hive, Database Architecture, Data Integration, PostgreSQL, Exasol, HBase, Azure SQL
Frameworks
Spark, Hadoop, Spark Structured Streaming
Other
Cloud, Cloud Architecture, Data Engineering, Big Data Architecture, Big Data, Delta Lake, Azure Data Factory (ADF), Azure Data Lake, Azure Databricks, Microsoft Azure, Data Migration, Data Warehousing, Data Architecture, Solution Architecture, Data, Programming, Data Warehouse Design, Reporting, Streaming Data, Data Governance, APIs, Machine Learning Operations (MLOps), Amazon RDS, OpenTSDB, AWS Cloud Architecture