Marc Matt

Data Engineer and Developer in Hamburg, Germany

Member since October 27, 2020
Marc is a data engineer with a passion for data and 15+ years of experience leading teams and building data platforms, focusing on the information technology, real estate, and services industries. He created a Python-based AVRO schema generator that makes parts of a schema reusable. Marc excels at automation, integrations, analysis, model building, statistics, big data, CI/CD pipelines, and data modeling.

Portfolio

  • Food Marketing Company
    Talend, JSON, Redshift Spectrum, Redshift
  • Janus
    AWS Glue, Spark, SQL, Amazon Aurora, Python
  • Emma
    Python, AWS Kinesis, AWS, Redshift, Redshift Spectrum...

Experience

Location

Hamburg, Germany

Availability

Part-time

Preferred Environment

Apache Airflow, Tableau Server, Tableau, SQL, Pandas, Python, Apache Beam, Git, Linux

The most amazing...

...app I've developed provides pose estimation data in real time to help customers optimize their fitness goals.

Employment

  • ETL Engineer

    2021 - 2021
    Food Marketing Company
    • Parsed JSON data in Talend and loaded it into Redshift.
    • Integrated data from Web APIs with Talend into Redshift.
    • Transformed customer data using Talend and loaded it into Salesforce.
    Technologies: Talend, JSON, Redshift Spectrum, Redshift
  • Data Engineer

    2021 - 2021
    Janus
    • Translated legacy ETL pipelines into scalable AWS Glue jobs (see the Glue job sketch after this employment list).
    • Automated resource deployment using AWS CloudFormation.
    • Designed and built a PySpark framework that makes adding future pipelines easier.
    Technologies: AWS Glue, Spark, SQL, Amazon Aurora, Python
  • Senior Data Engineer

    2021 - 2021
    Emma
    • Designed a new data ingestion API for the data platform to enable streaming analytics.
    • Set up a binlog streaming process with real-time event parsing using Kinesis, Lambda, and Kinesis Firehose (see the handler sketch after this employment list).
    • Optimized data loading in Redshift by analyzing queries and tables and adding optimized sort and distribution keys.
    Technologies: Python, AWS Kinesis, AWS, Redshift, Redshift Spectrum, Matillion ETL for Redshift, AWS Lambda, Parquet, AWS Fargate, Docker, Databases
  • Data Specialist

    2020 - 2021
    Ear-Reality GmbH
    • Developed a data lake based on Kinesis and Athena, including embedded reporting in Metabase.
    • Shifted the production system to a serverless, scalable architecture.
    • Automated load testing of the application using Python and Locust.io (see the Locust sketch after this employment list).
    Technologies: Amazon Web Services (AWS), SQL, AWS Kinesis, Amazon Athena, AWS Elastic Beanstalk, Docker, Python, AWS CloudFormation, Databases, Data Reporting, Business Intelligence (BI)
  • Senior Data Engineer

    2018 - 2020
    Engel & Völkers
    • Designed and built a data platform, including tool selection and data modeling.
    • Built a TensorFlow model to predict property values in a real-time environment.
    • Implemented CI/CD pipelines to automatically deploy all features of the data platform.
    Technologies: Jenkins, SQL, Tableau, BigQuery, Apache Beam, Apache Airflow, TensorFlow, Google Kubernetes Engine (GKE), Docker, Python, Data Engineering, Data Architecture, Data Analysis, NoSQL, Google BigQuery, Data Pipelines, ETL, Data Warehousing, Data Warehouse Design, Database Modeling, Data Modeling, Google Cloud Platform (GCP), Google Cloud SQL, Data Science, Databases, Data Reporting, Business Intelligence (BI)
  • Head of Data Engineering/Machine Learning

    2014 - 2018
    Surf Media
    • Led a team of six and was responsible for their personal development.
    • Designed big data systems and data lakes, including tool selection and data modeling.
    • Designed data pipelines and selected models for recommendation engines and fraud recognition systems that work in a real-time environment.
    • Created the technology roadmap. Oversaw the advancement of all affected data systems.
    Technologies: TensorFlow, RabbitMQ, Apache Avro, Tableau, Hortonworks Data Platform (HDP), SQL, Apache NiFi, Apache HAWQ, Talend, Python, Data Engineering, PostgreSQL, AWS S3, AWS Lambda, Data Architecture, Amazon Web Services (AWS), NoSQL, Data Pipelines, ETL, Data Warehousing, Data Warehouse Design, Database Modeling, Data Modeling, Talend ETL, Data Science, Databases, Data Reporting, Business Intelligence (BI)
  • Business Intelligence Analyst

    2012 - 2014
    Surf Media
    • Designed, developed, and operated a DWH for the company group consisting of five companies.
    • Developed a statistical model for predicting orders.
    • Analyzed customers to understand how best to optimize revenue in a social network.
    Technologies: KNIME, RapidMiner, Tableau, Perl, Python, MySQL, Data Engineering, PostgreSQL, Data Architecture, Amazon Web Services (AWS), Data Pipelines, ETL, Data Warehousing, Data Warehouse Design, Database Modeling, Data Modeling, Talend ETL, Databases, Data Reporting, Business Intelligence (BI)
  • Database Consultant

    2010 - 2012
    EOS Information Services GmbH
    • Designed, developed, and operated a DWH for a Decision Engine used in risk management.
    • Designed processes for risk management.
    • Designed and developed a process for managing addresses using Perl and Uniserv.
    Technologies: Oracle, Java, Perl, Uniserv, Data Engineering, Data Architecture, Data Analysis, Data Pipelines, ETL, Data Warehousing, Data Warehouse Design, Database Modeling, Databases, Business Intelligence (BI)
  • Data Warehousing Consultant

    2009 - 2010
    Key-Work Consulting GmbH
    • Migrated the sales reporting for a mail-order company.
    • Developed a statistical model to optimize the sales planning of a mail-order company.
    • Built a statistical model for a dynamic shipping schedule.
    Technologies: RapidMiner, FastStats, Python, SQL, SQL Server 2010, Data Engineering, Data Analysis, Data Pipelines, ETL, Data Warehousing, Data Warehouse Design, Database Modeling, Data Modeling, Databases, Data Reporting, Business Intelligence (BI)
  • Database Management

    2008 - 2009
    Coxulto Marketing Solutions GmbH
    • Defined and selected target groups for marketing campaigns.
    • Performed an affinity analysis of the entire customer base.
    • Administered and operated the address database, including duplicate elimination.
    Technologies: Perl, SQL, SAS, Data Engineering, Data Analysis, ETL, Data Warehouse Design, Data Warehousing, Databases, Data Reporting, Business Intelligence (BI)
  • Lead of Business Intelligence Consumer Products

    2007 - 2008
    1&1 Internet AG
    • Coordinated and prioritized all tasks of the Business Intelligence team.
    • Designed and developed KPI reports for the board of directors.
    • Analyzed customer structures and built a model for churn prediction.
    Technologies: Sybase, Java, Perl, Data Engineering, Data Analysis, ETL, Data Warehouse Design, Data Warehousing, Database Modeling, Data Modeling, Databases, Data Reporting, Business Intelligence (BI)
  • Business Intelligence Analyst

    2003 - 2007
    1&1 Internet AG
    • Designed and developed an automated reporting system for customer and contract inventory, as well as internet usage and customer behavior.
    • Integrated the customer usage data of the company websites into the DWH.
    • Coordinated all tasks between management and development departments.
    • Analyzed all new and existing customer campaigns for effectiveness.
    Technologies: Sybase, Java, MySQL, Perl, Data Engineering, Data Analysis, ETL, Data Warehouse Design, Data Warehousing, Databases, Data Reporting, Business Intelligence (BI)
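
The roles above lean on a few recurring patterns; sketches of three of them follow. First, for the Janus work, a minimal AWS Glue job in PySpark that reads a cataloged table, applies a column mapping, and writes Parquet to S3. The database, table, and bucket names are hypothetical placeholders, not the actual project's.

    # Minimal AWS Glue job sketch: catalog read -> column mapping -> Parquet on S3.
    # Database, table, and bucket names are hypothetical placeholders.
    import sys

    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the source table registered in the Glue Data Catalog.
    source = glue_context.create_dynamic_frame.from_catalog(
        database="legacy_db", table_name="orders"
    )

    # Apply a simple column mapping; a real pipeline plugs in its own transforms.
    mapped = source.apply_mapping(
        [("order_id", "string", "order_id", "string"),
         ("amount", "double", "amount", "double")]
    )

    # Write the result to S3 as Parquet.
    glue_context.write_dynamic_frame.from_options(
        frame=mapped,
        connection_type="s3",
        connection_options={"path": "s3://example-bucket/orders/"},
        format="parquet",
    )
    job.commit()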
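
Second, for the binlog streaming at Emma, a minimal sketch of a Lambda handler that decodes and parses change events arriving from a Kinesis stream. The payload fields ("table" and "row") are assumptions; a real binlog producer defines its own event format.

    # Minimal Lambda handler sketch for a Kinesis event source.
    # The payload shape ({"table": ..., "row": ...}) is an assumption.
    import base64
    import json

    def handler(event, context):
        parsed = []
        for record in event["Records"]:
            # Kinesis payloads arrive base64-encoded inside the Lambda event.
            payload = base64.b64decode(record["kinesis"]["data"])
            change = json.loads(payload)
            parsed.append({"table": change.get("table"), "row": change.get("row")})
        # Downstream, events like these were batched toward Redshift via Firehose.
        return {"processed": len(parsed)}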
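
Third, for the automated load testing at Ear-Reality, a minimal Locust test file; the endpoint paths are hypothetical placeholders. A file like this runs with "locust -f loadtest.py --host=https://example.com".

    # Minimal Locust load-test sketch; endpoint paths are placeholders.
    from locust import HttpUser, between, task

    class ApiUser(HttpUser):
        # Each simulated user pauses one to three seconds between requests.
        wait_time = between(1, 3)

        @task(3)
        def get_items(self):
            self.client.get("/api/items")

        @task(1)
        def post_event(self):
            self.client.post("/api/events", json={"type": "ping"})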

Experience

  • AVRO Schema Generator
    https://gitlab.com/datascientists.info/avro-generator

    A Python-based AVRO schema generator I developed that makes parts of a schema reusable, which is useful because AVRO does not provide this capability by itself.

    If certain data structures are used in several schemas, this tool lets them be defined once and reused across all of them (a sketch of the reuse idea follows this project list).

  • Evaluation of Property Value

    A Python/TensorFlow-based deep learning model and API I built to predict property prices based on geolocation and other attributes. Values are predicted in real time through a Flask REST API integrated into the client's website (a minimal endpoint sketch follows this project list).

  • Design and Set-up of Data Platform

    A platform consolidating all relevant data of a social media company; I designed it and helped set up the various tools involved. The platform provided real-time access to all data for operational decision support as well as for analytic workloads.
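
To illustrate the reuse idea behind the AVRO schema generator, here is a sketch in which a shared record is defined once and spliced into a schema before validation with fastavro. The "$use" placeholder convention is invented for this example; the actual generator linked above may work differently.

    # Sketch of reusable AVRO schema parts via placeholder substitution.
    # The "$use" convention is hypothetical, not the generator's actual syntax.
    import copy

    from fastavro import parse_schema

    SHARED = {
        "address": {
            "type": "record", "name": "Address",
            "fields": [{"name": "city", "type": "string"},
                       {"name": "zip", "type": "string"}],
        }
    }

    def expand(schema):
        """Recursively replace {"$use": name} placeholders with shared parts."""
        if isinstance(schema, dict):
            if "$use" in schema:
                return copy.deepcopy(SHARED[schema["$use"]])
            return {key: expand(value) for key, value in schema.items()}
        if isinstance(schema, list):
            return [expand(item) for item in schema]
        return schema

    customer = expand({
        "type": "record", "name": "Customer",
        "fields": [{"name": "id", "type": "long"},
                   {"name": "address", "type": {"$use": "address"}}],
    })
    parse_schema(customer)  # raises if the expanded schema is invalid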
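
In the same spirit, a minimal Flask endpoint serving real-time predictions from a saved TensorFlow model could look like the sketch below; the model file name and feature layout are assumptions, not the client's actual setup.

    # Minimal Flask prediction endpoint sketch; file name and features assumed.
    import numpy as np
    import tensorflow as tf
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    model = tf.keras.models.load_model("property_model.keras")  # hypothetical path

    @app.route("/predict", methods=["POST"])
    def predict():
        body = request.get_json()
        # Assumed feature layout: latitude, longitude, living area in sqm.
        features = np.array([[body["lat"], body["lon"], body["sqm"]]])
        price = float(model.predict(features)[0][0])
        return jsonify({"predicted_price": price})

    if __name__ == "__main__":
        app.run(port=5000)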

Skills

  • Languages

    Python, SQL, Perl, Java, XML
  • Tools

    BigQuery, Apache HAWQ, Apache Avro, Git, Apache Beam, Tableau, Apache Airflow, Jenkins, Apache NiFi, RabbitMQ, Microsoft Excel, Google Kubernetes Engine (GKE), Talend ETL, Amazon Athena, AWS CloudFormation, Redshift Spectrum, Matillion ETL for Redshift, AWS Fargate, AWS Glue
  • Paradigms

    ETL, Business Intelligence (BI), Data Science
  • Storage

    MySQL, Google Cloud, Database Modeling, Databases, SQL Server 2010, Data Pipelines, AWS S3, PostgreSQL, Google Cloud SQL, Apache Hive, HDFS, NoSQL, Redshift, Amazon Aurora, JSON
  • Other

    Data Visualization, Data Analysis, Data Architecture, Data Engineering, Data Warehousing, Data Modeling, Data Warehouse Design, Data Reporting, Tableau Server, Google BigQuery, AWS, Data Profiling, Parquet
  • Libraries/APIs

    Pandas, TensorFlow
  • Platforms

    Linux, Docker, Talend, Hortonworks Data Platform (HDP), Oracle, Amazon Web Services (AWS), AWS Lambda, Google Cloud Platform (GCP), AWS Kinesis, AWS Elastic Beanstalk
  • Frameworks

    Flask, Django, Spark

Certifications

  • Google Cloud Certified - Professional Data Engineer
    AUGUST 2019 - AUGUST 2021
    Google
