Bence Mélykúti, Developer in Freiburg, Baden-Württemberg, Germany

Bence Mélykúti

Verified Expert in Engineering

Data Scientist and Algorithm Developer

Location
Freiburg, Baden-Württemberg, Germany
Toptal Member Since
June 18, 2020

Bence is a data science, machine learning, and statistics expert with ten years of experience in applied mathematics and probability theory (PhD from Oxford and internationally acclaimed research). He specializes in data analysis, AI and geoinformatics, and non-standard/hard problems requiring mathematical modeling and algorithm development. His highlight projects include the quality assessment of satellite imagery and the audit and improvement of a search advertising bid optimization platform.

Portfolio

Neural Search Labs GmbH
Pandas, Marketing Analytics, Ad Optimization, Optimization, Modeling, Linux...
DataMilk
A/B Testing, Python, BigQuery, Google BigQuery, Google Data Studio, Looker...
Freelance
Geoinformatics, Spark, Pandas, GDAL, OSGeo, Overpass, GDAL/OGR...

Availability

Part-time

Preferred Environment

TensorFlow, SQL, Pandas, PySpark, Google Cloud Platform (GCP), Python, Scikit-learn, MATLAB, XGBoost, Keras

The most amazing...

...project I've worked on was being an equal, valued member of an ex-Googler's startup where I was in charge of building the data science platform.

Work Experience

Lead Data Scientist and Mathematician

2020 - PRESENT
Neural Search Labs GmbH
  • Audited the back-end algorithms and an underpinning model of a search advertising optimization provider that serves online retailers by bidding on their behalf.
  • Proposed a novel methodological component, which was adopted into their development plan.
  • Developed and delivered a machine learning solution for production that predicts visitor purchasing behavior on the retailer's site. This is used to set bids strategically.
  • Architected and developed code that communicates with the Amazon and BOL.com APIs to verify user credentials, create advertising campaigns, update campaign parameters (bids), download performance reports, and store the information in a database.
Technologies: Pandas, Marketing Analytics, Ad Optimization, Optimization, Modeling, Linux, Docker, Python 3, Python, Scikit-learn, Mathematical Modeling, XGBoost, MySQL, Data Science

Technical Lead in Data Science

2021 - 2023
DataMilk
  • Developed statistical methodology and software architecture. Added new functionality, in high-quality, unit-tested Python code, to the analytical data processing pipelines that ran the A/B testing framework.
  • Ideated and supervised the creation of all internal data dashboards that use the output of the above pipelines, both for operations and to monitor the financial health of the company. The dashboards were monitored daily by our C-level executives.
  • Oversaw the gradual integration of new visualizations from the internal dashboards into the customer-facing front end.
  • Conducted several complex, ad hoc studies using SQL in BigQuery and Python in Google Colab notebooks to understand anomalous data and to develop data cleaning methods. Wrote reports and formulated recommendations for management.
  • Improved methodology and code quality through mentoring, design reviews, and code reviews. Earned accolades for my attention to detail.
  • Led a team of one other data scientist and a data engineer, achieving good productivity and a very low bug rate through backlog refinement, ticket grooming, prioritization, daily stand-ups, and code reviews.
  • Became a cross-functional source of information on the internal workings of software components outside my direct area, thanks to my initiative and desire to understand interconnections and ensure quality.
Technologies: A/B Testing, Python, BigQuery, Google BigQuery, Google Data Studio, Looker, Google, Google Cloud Platform (GCP), Google Colaboratory (Colab), Statistics, Data Visualization, Code Review, Agile, Data Science, Data Analysis, Jira, Confluence

Data Scientist for Remote Sensing

2019 - 2020
Freelance
  • Contributed ideas and research, developed a self-contained methodology, and wrote a Python program for image quality assessment for Sentinel-2 multispectral satellite imagery to be a part of a secure data hosting service for Hungarian authorities.
  • Processed 4,000 GB of historical image data on a multicore Google Cloud virtual machine to build a historical baseline against which the assessed products are compared.
  • Packaged the program in a Docker container and earned praise for the product being compact and easy to operate.
  • Used GDAL with geographic raster images. Referenced different coordinate systems, found details in images by their coordinates, and opened only parts of large image files. Integrated intelligent cloud and snow detection.
  • Learned the fundamental physics and principles of interferometric synthetic-aperture radar (radar satellites) in nine days. Owing to my guidance, my project manager discovered that one of his expert advisors was an authority on the question.
Technologies: Geoinformatics, Spark, Pandas, GDAL, OSGeo, Overpass, GDAL/OGR, Overpass Query Language (Overpass QL), GeoPandas, Modeling, Linux, PySpark, Apache Spark, Google Cloud Platform (GCP), Docker, Shapely, GIS, Python 3, Python, Mathematical Modeling, Data Science, Data Analysis, Geospatial Data, Geospatial Analytics

Postdoctoral Research Fellow

2012 - 2017
University of Freiburg (Germany)
  • Acquired research funding from the prestigious Alexander von Humboldt Foundation and the AXA Research Fund with my research proposals, demonstrating independent and creative thinking at the forefront of human knowledge.
  • Managed the project independently, including research and problem-solving, budget planning, financial reporting, and dissemination of results in publications and at conferences.
  • Solved both practically motivated and theoretical problems, which resulted in two publications in international scientific journals.
  • Developed a novel statistical method for the assessment of cross-contamination in digital PCR experiments in cutting-edge lab-on-a-chip devices. Implemented the algorithm in MATLAB with a student I supervised. Released the software on GitHub.
Technologies: Applied Mathematics, Mathematics, Time Series Analysis, Time Series, LaTeX, Probability Theory, Modeling, Linux, Mathematical Modeling, Mathematica, MATLAB, Stochastic Differential Equations, Discrete Mathematics

Postdoctoral Researcher

2010 - 2011
University of California Santa Barbara (UCSB)
  • Worked at the Department of Mechanical Engineering under the supervision of two renowned control engineers, professors Mustafa Khammash and Joao Hespanha, demonstrating versatility and the ability to learn quickly.
  • Researched stochastic models of biochemical reaction systems.
  • Published my results in a prestigious scientific journal, the Journal of the Royal Society Interface; the paper has received over 50 citations.
Technologies: Applied Mathematics, Mathematics, Time Series Analysis, Time Series, LaTeX, Probability Theory, Modeling, Linux, Mathematical Modeling, MATLAB, Stochastic Modeling, Dynamic Analysis, Stochastic Differential Equations

Building the Statistical Evaluation Pipeline of an A/B Testing Platform

Starting from a batch processing data pipeline skeleton, I built the A/B test evaluation framework. I designed the statistical methodology (statistical hypothesis testing, bootstrap sampling to estimate confidence intervals, and outlier removal) and implemented much of it. I played a central role in exposing this data on dashboards.
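
To illustrate the bootstrap step mentioned above, here is a minimal sketch of a percentile bootstrap confidence interval for the difference in means between two test arms. It is not the platform's actual code: the function name, the choice of the percentile method, the metric (per-visitor revenue), and the sample figures are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_mean_diff_ci(control, treatment, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for mean(treatment) - mean(control).

    Hypothetical helper for illustration; not taken from the original project.
    """
    rng = random.Random(seed)  # fixed seed so repeated calls give the same interval
    diffs = []
    for _ in range(n_resamples):
        # Resample each arm with replacement, keeping the original sample sizes.
        c = rng.choices(control, k=len(control))
        t = rng.choices(treatment, k=len(treatment))
        diffs.append(statistics.fmean(t) - statistics.fmean(c))
    diffs.sort()
    # Take the alpha/2 and 1 - alpha/2 empirical quantiles of the resampled differences.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return diffs[lo_idx], diffs[hi_idx]


# Made-up per-visitor revenue figures, for illustration only.
control = [0.0, 0.0, 12.5, 0.0, 30.0, 0.0, 8.0, 0.0, 0.0, 15.0]
treatment = [0.0, 10.0, 14.0, 0.0, 35.0, 0.0, 9.5, 20.0, 0.0, 18.0]

low, high = bootstrap_mean_diff_ci(control, treatment)
observed = statistics.fmean(treatment) - statistics.fmean(control)
print(f"observed difference: {observed:.2f}, 95% CI: [{low:.2f}, {high:.2f}]")
```

If the interval excludes zero, the observed difference is unlikely to be due to resampling noise alone. With samples this small the interval will typically be wide, which is why outlier handling and sample size matter in practice.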

Optimization of Search Advertising for Online Retailers

I researched, designed, and implemented a machine-learning algorithm that predicts the purchasing behavior of online search users. Based on predictions of transaction volume and/or expected revenue, we set bids intelligently to maximize user growth and profit.

Quality Assessment for Optical and Radar Satellite Imagery

I contributed ideas and research, developed a self-contained methodology, and wrote a Python program that uses OSGeo (GDAL, OGR) for image quality assessment of Sentinel-2 multispectral satellite imagery to be a part of a secure data hosting service for Hungarian authorities.
I proposed using tropospheric water content and ionospheric activity to forecast the quality of Sentinel-1 interferometric synthetic-aperture radar images. My guidance led my project manager to obtain an offer from the leading Hungarian expert on the subject.

Languages

Python, Python 3, SQL, JavaScript

Libraries/APIs

Pandas, GDAL/OGR, TensorFlow, Scikit-learn, XGBoost, Keras, Shapely, PySpark, Beautiful Soup, GDAL, WebExtensions API

Tools

BigQuery, LaTeX, MATLAB, Git, GIS, Jupyter, Jira, OSGeo, Mathematica, Looker, Confluence

Paradigms

Data Science, Linear Programming, Agile

Other

Optimization, Mathematical Modeling, Discrete Mathematics, Dynamic Analysis, Machine Learning, Artificial Intelligence (AI), Artificial Neural Networks (ANN), Probability Theory, Stochastic Modeling, Stochastic Differential Equations, Data Analysis, Complex Data Analysis, Data Visualization, Analytics, Modeling, Statistics, Bayesian Statistics, Data Analytics, Big Data, Algorithms, Google BigQuery, Mathematics, Applied Mathematics, Statistical Modeling, Quantitative Modeling, Numerical Simulations, GeoPandas, Overpass Query Language (Overpass QL), Ad Optimization, Time Series, Time Series Analysis, Investing, Visualization Tools, Operations Research, Computational Statistics, A/B Testing, Google Data Studio, Google Colaboratory (Colab), Tkinter, PDAL, Marketing Analytics, Overpass, Geoinformatics, Google, Pulumi, Code Review, Geospatial Data, Geospatial Analytics

Platforms

Jupyter Notebook, Google Cloud Platform (GCP), Linux, Docker

Storage

PostgreSQL, Neo4j, MySQL

Frameworks

Apache Spark, Spark

2006 - 2011

Ph.D., Department of Statistics

University of Oxford - Oxford, UK

2001 - 2006

Master of Science Degree in Mathematics

Eötvös Loránd University (ELTE) - Budapest, Hungary

MAY 2018 - PRESENT

Data Engineering on Google Cloud Platform, a 5-course Specialization by Google Cloud on Coursera

Coursera
