Balint Kubik
Verified Expert in Engineering
Data Engineer and Developer
Balint is a versatile senior data engineer who has used cloud-hosted technologies to solve data use cases in the corporate and academic sectors. He has implemented data architectures from scratch, enabled advanced reporting capabilities, and supported data science use cases on AWS, Microsoft Azure, and Google Cloud Platform. Driven to produce well-tested, maintainable software, Balint is experienced in the Python and Scala programming languages and several (No)SQL dialects.
Preferred Environment
Linux, PyCharm, Visual Studio Code (VS Code)
The most amazing role I've had was driving the build of a large-scale, fully cloud-hosted data warehousing system that hosts eCommerce data for large clients.
Work Experience
Senior Data Engineer
12traits
- Streamlined the movement, processing, and transformation of rich behavioral big data from several large clients in the gaming and health industries with over 300 million EUR in revenue.
- Enabled performant, scalable access to business-critical KPIs derived from hundreds of GBs of data through back-end APIs.
- Introduced modern batch and stream-processing pipelines and workflow scheduling engines to the organization.
Senior Data Engineer
Cleverbridge AG (Freelance)
- Supported the release of a Microsoft Azure-hosted reporting product to one of the company's top-three enterprise clients, a company generating $300+ million in annual revenue.
- Assisted with scaling the calculation of exhaustive eCommerce KPIs, which increased speed by approximately 80%.
- Implemented state-of-the-art security best practices in Microsoft Azure to protect business-sensitive information and share data with external parties.
- Improved the scalability and monitorability of a large reporting system through consistent QA testing and efficient backfilling mechanisms.
Data Engineer
Cleverbridge AG
- Planned and implemented a data warehousing system for reporting and analytics on Microsoft Azure for enterprise clients that generated $400+ million in aggregate annual revenue.
- Managed the fully cloud-hosted environment using infrastructure as code (IaC), designed and implemented database schemas, and built ETL pipelines for processing granular eCommerce datasets comprising hundreds of millions of rows of data.
- Communicated product goals to internal and external stakeholders and managed the backlog of a three-person Agile development team.
Software Developer
Starschema Ltd
- Automated provisioning and recovery mechanisms of Hadoop and Tableau clusters hosted on AWS for Fortune 500 clients.
- Deployed image classification for anomaly detection in power plants for one of the largest industrial companies in the world.
- Implemented a solution to host containerized (Dockerized) Apache Kafka on Apache Mesos.
Researcher
Hungarian Academy of Sciences
- Collected, processed, and performed text analysis on large corpora consisting of millions of sentences derived from audio recordings covering more than five days.
- Presented research results at the International Conference on Computational Social Science in 2018, the largest conference of its type in the world.
- Mapped the network of pieces of Hungarian legislation using text mining techniques. The findings were published in a scientific publication.
Experience
Large-scale, Cloud-hosted Data Warehouse for eCommerce
Role:
- Managed the cloud environment (IaC).
- Built ETL pipelines.
- Managed database schema.
- Built standardized reporting dashboards and performed ad hoc analytics.
- Developed Python microservices.
- Calculated eCommerce (subscription) KPIs.
- Communicated with internal and external stakeholders.
Skills
Languages
SQL, Python, R, Go
Frameworks
Apache Spark, Hadoop, Django
Tools
Microsoft Power BI, Apache Airflow, Tableau, PyCharm, Apache Mesos, BigQuery, Apache Beam
Paradigms
ETL, Database Development, Test-driven Development (TDD)
Platforms
Docker, Kubernetes, Amazon Web Services (AWS), Databricks, Google Cloud Platform (GCP), Visual Studio Code (VS Code), Linux
Storage
Azure SQL, Data Lakes, Data Pipelines, Databases, Database Management, PostgreSQL, Google Cloud, Elasticsearch
Other
Data Analysis, Microsoft Azure, Azure Data Factory, Google BigQuery, Data Engineering, Data Quality, Data Modeling, Reports, Data Warehousing, Data Architecture, DAX, Data Warehouse Design, Azure Data Lake, Security, Data Visualization, Research, Image Processing, Metabase
Education
Master's Degree in Economics
Eötvös Loránd University - Budapest, Hungary
Bachelor's Degree in Political Science
Corvinus University of Budapest - Budapest, Hungary