
Peter Papai
Verified Expert in Engineering
Data Scientist and Software Developer
Bangkok, Thailand
Toptal member since June 4, 2020
With a PhD in physics, Peter is a developer working in the field of data science. He has five years of full-time experience working on big data projects at a large internet company. Peter has formulated business goals and designed, prototyped, productized, and A/B-tested machine learning algorithms in several areas. His insights gleaned from data have helped stakeholders make impactful business decisions.
Experience
- Python - 10 years
- Machine Learning - 10 years
- Statistics - 10 years
- Spark - 4 years
- Deep Learning - 4 years
- Scala - 4 years
- A/B Testing - 3 years
- PyTorch - 2 years
Preferred Environment
Git, IntelliJ IDEA, PyCharm, Visual Studio Code (VS Code), Google Cloud Platform (GCP), Spark, Hugging Face, PyTorch
The most amazing...
...thing I've done is to formulate a mathematical framework to price hotel rooms, which has become an essential driver of business for my employer.
Work Experience
Data Scientist
DiagnoseEarly
- Developed the scientific components of the software that identified toxins in the mass spectrum of human breath.
- Dockerized the solution to make it available as a microservice.
- Researched the scientific literature on mass spectrometry, made recommendations, and conducted feasibility studies to help the early-stage startup define its product roadmap.
- Wrote algorithms to process heart rate data from wearable devices as a component of a health care app.
Lead Data Scientist
Tokopedia
- Improved the ranking algorithm on the search page, which increased the number of orders by around 1% overall according to A/B tests.
- Enhanced the A/B testing framework, rectifying the harm done by many erroneous past experiments.
- Refined the relevance of keyword-targeted ads, increasing ad revenue by 2%.
- Collaborated with the tech team to deploy models in production on GCP.
- Built ETL pipelines to create features using BigQuery and Dataflow.
Lead Data Scientist
Agoda
- Developed a system to monitor thousands of time series for anomaly detection.
- Improved fraud detection using machine learning and new A/B testing strategies.
- Provided mentoring for less experienced data scientists.
- Communicated with stakeholders, worked on roadmaps, and defined KPIs for projects.
- Helped the tech team to deploy deep learning models in production.
Senior Data Scientist
Agoda (Booking Holdings)
- Served as a core member of back-end teams following agile practices, including Scrum, issue tracking in Jira, and Git pull requests.
- Turned business goals into math objectives and implemented algorithms to optimize them for pricing.
- Implemented content and collaborative filtering-based algorithms for ranking.
- Applied time series prediction techniques for demand forecasting.
- Cooperated with tech teams to put models written in Scala or Python into production using a variety of frameworks and tools.
- Built ETL pipelines for feature engineering, mainly using Spark.
Senior Data Analyst
IO Technologies
- Prototyped a model for click-through rate prediction that satisfied the company's architectural constraints.
- Used the company's cloud-based stack (AWS) to produce the model.
- Provided training on machine learning and data science for coworkers.
Quantitative Researcher
WorldQuant
- Researched the scientific literature for ideas for automated trading.
- Implemented predictive models using different data sources, such as historical stock returns and news articles.
- Tested predictive algorithms offline, using historical data.
Postdoctoral Researcher
The Abdus Salam International Centre for Theoretical Physics (ICTP)
- Published research papers in the field of cosmology.
- Helped create teaching materials for undergraduates.
- Analyzed galaxy survey data to constrain theoretical models of the universe.
Experience
Cost Reduction in Fraud Detection
Dynamic Pricing of Hotel Rooms
Click-through Rate Prediction for Online Advertising
Personalized Ranking
Inventory Matching
Anomaly Detection – Time Series
Education
PhD in Physics
University of Hawaii at Manoa - Honolulu, Hawaii, USA
Master of Science Degree in Physics
ELTE - Budapest, Hungary
Certifications
Data Engineering. AI Data Engineering
Amazon Web Services | via Coursera
Generative AI with Large Language Models
Coursera
Natural Language Processing Specialization
Coursera
Generative Adversarial Networks (GANs) Specialization
Coursera
AI for Medicine Specialization
Coursera
Self-Driving Cars Specialization
Coursera
Skills
Libraries/APIs
Spark ML, NumPy, Pandas, PyTorch, Scikit-learn, SciPy, Hugging Face Transformers
Tools
Git, PyCharm, IntelliJ IDEA, Mathematica, BigQuery
Languages
Python, Python 3, Scala, SQL, C++, PMML
Paradigms
ETL, Anomaly Detection
Frameworks
Spark, Apache Spark, Flask
Platforms
Visual Studio Code (VS Code), Amazon Web Services (AWS), Google Cloud Platform (GCP), Docker, Jupyter Notebook
Storage
Apache Hive, HDFS, NoSQL, Data Pipelines
Other
Machine Learning, Mathematics, Data Science, Physics, Artificial Intelligence (AI), Supervised Machine Learning, Data Analysis, Data Cleansing, Data Scientist, Statistics, Deep Learning, Time Series, Natural Language Processing (NLP), Big Data, Recommendation Systems, DBSCAN, Clustering, Clustering Algorithms, K-means Clustering, Google BigQuery, Data Processing, Decision Trees, Scripting, A/B Testing, Machine Vision, Time Series Analysis, Click-through Rates (CTR), Naive Bayes, Advertising, Revenue Optimization, Operations Research, Generative Pre-trained Transformers (GPT), Hugging Face, Risk Modeling, Optimization, Scientific Computing, GC Mass Spectrometry, Computer Vision, Robotics, Large Language Models (LLMs), Transformer Models, LoRa, AI Agents, LangChain, Algorithmic Trading, Data Engineering, Vector Databases, Streaming Data, Data Warehousing, Data Visualization