
Daniel Beasley
Verified Expert in Engineering
Machine Learning Developer
Amsterdam, Netherlands
Toptal member since August 14, 2022
Daniel is passionate about data analytics and confident in solving problems with machine learning. In the past, he's worked on various machine learning problems, including computer vision, price recommendation, and spectral classification. His best quality in this area is developing practical solutions to business problems. If an 80% solution can be delivered in a short amount of time, it may be worthwhile to implement it and move on to a new problem.
Portfolio
Experience
- Statistics - 12 years
- Python - 7 years
- Pandas - 7 years
- Scikit-learn - 6 years
- Data Analysis - 6 years
- Jupyter - 5 years
- Machine Learning - 5 years
- SQL - 5 years
Availability
Preferred Environment
Jupyter, Python, PyCharm
The most amazing...
...model I've developed is a classifier to identify pathogens using spectroscopy. The project was end to end, and involved novel methods of analysis and ML.
Work Experience
Senior Marketing Data Scientist
Vinted
- Updated the payback calculation model for effective ROI calculation.
- Developed Bayesian marketing mix models (MMM) for understanding marketing efficiency.
- Performed monthly market reporting and communication with marketing managers.
Data Scientist
Nostics
- Implemented data science models for identifying and classifying pathogens like bacteria and viruses using surface-enhanced Raman spectroscopy.
- Developed a 95% sensitive and 95% specific multiplex bacterial classification algorithm using a combination of principal component analysis (PCA), DBSCAN, and partial least squares regression and deployed it to the AI Platform in Google Cloud.
- Created a custom dashboard using Dash and hosted it on Google App Engine, allowing our researchers to interact quickly with data.
- Researched and experimented with techniques for analyzing high-dimensional spectral data, such as preprocessing, similarity measures, and signal extraction.
Data Science Team Lead
Trivago
- Led a cross-functional team of six data scientists and engineers developing data science solutions for features relating to price competitiveness.
- Oversaw the engineering development of the weekend search functionality. This was a challenging feature as it bypassed the original Trivago search and let users search for trips in a variety of places and times based on their value and appeal.
- Developed and implemented the Trivago Price Index, a user-facing scale to assess a given deal's value for money.
Data Scientist
Trivago
- Developed an autoencoder and keypoint-based solution to de-duplicate image galleries and optimized the solution to evaluate 300 million pairs of images.
- Trained and implemented a deep learning-based image quality score using TensorFlow and Amazon SageMaker.
- Developed custom KPI dashboards using Impala and Hive.
- Trained and deployed over 90% precise hotel-specific image tagging models using TensorFlow and AWS.
Experience
Bacteria Classifier
Principal component analysis (PCA) was used to identify outliers in the data. From PCA, one can calculate the Q-residual and Hotelling's T-squared statistics. Along with the Mahalanobis distance, these statistics make for effective high-dimensional outlier detection. DBSCAN was used to segment the high-dimensional space. This was necessary because some bacteria had two distinct signatures, which would confuse a classifier that assumes samples of the same bacterium are similarly distributed. Partial least squares regression was then applied within each DBSCAN cluster to further subdivide the high-dimensional space. Altogether, this led to a highly specific and sensitive classifier. I packaged the trained classifiers in Python and deployed them to the AI Platform in Google Cloud.
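As a rough illustration of the outlier statistics mentioned above, the following is a minimal sketch of how Hotelling's T-squared and the Q-residual can be computed from a PCA model. It is not the project's actual code: the function name pca_outlier_stats, the synthetic data standing in for spectra, the choice of three components, and the percentile-based cutoffs are all assumptions made for illustration. NumPy's SVD is used here in place of a library PCA class to keep the example self-contained.

```python
import numpy as np


def pca_outlier_stats(X, n_components):
    """Return (Hotelling's T-squared, Q-residual) for each row of X.

    Hypothetical helper for illustration only.
    """
    Xc = X - X.mean(axis=0)                      # center each feature
    # SVD of the centered data; rows of Vt are the principal axes
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                      # loadings, shape (features, k)
    scores = Xc @ P                              # projections, shape (samples, k)
    # Variance explained by each retained component
    var = S[:n_components] ** 2 / (X.shape[0] - 1)
    # T-squared: variance-scaled distance inside the PCA subspace
    t2 = np.sum(scores ** 2 / var, axis=1)
    # Q-residual: squared reconstruction error outside the PCA subspace
    residual = Xc - scores @ P.T
    q = np.sum(residual ** 2, axis=1)
    return t2, q


# Synthetic stand-in for spectra (assumption): 200 samples, 50 channels,
# generated from 3 latent factors plus a small amount of noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 50))
X[0] += 5.0 * rng.normal(size=50)                # push one sample off the subspace

t2, q = pca_outlier_stats(X, n_components=3)

# Empirical 99th-percentile cutoffs (an assumption; analytical limits also exist)
q_limit = np.percentile(q, 99)
t2_limit = np.percentile(t2, 99)
outliers = np.where((q > q_limit) | (t2 > t2_limit))[0]
print(outliers)
```

The two statistics are complementary: T-squared flags samples that are extreme within the subspace the model describes, while the Q-residual flags samples that the model cannot reconstruct at all, such as the corrupted sample in this synthetic example.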
Education
Master's Degree in Mathematics (Probability and Statistics)
Vrije Universiteit Amsterdam - Amsterdam, Netherlands
Bachelor's Degree in Physics
University of Waterloo - Waterloo, Canada
Certifications
Machine Learning Engineer
Udacity
Skills
Libraries/APIs
Pandas, Scikit-learn, NumPy, TensorFlow
Tools
Jupyter, PyCharm, Impala, Amazon SageMaker, Looker
Languages
Python, SQL, C++, R
Paradigms
ETL, Linear Programming, Management
Platforms
Amazon Web Services (AWS), Google Cloud Platform (GCP)
Storage
Google Cloud, PostgreSQL, Apache Hive
Frameworks
Hadoop
Other
Data Analysis, Calculus, Statistics, Probability Theory, Machine Learning, Artificial Intelligence (AI), Data Science, Data Modeling, Data Mining, Data Analytics, Data Visualization, Technical Hiring, Code Review, Source Code Review, Task Analysis, Neural Networks, Large Data Sets, Data Manipulation, Data Extraction, Data Collection, Jupyter, Data Wrangling, Mathematical Modeling, Algorithms, Physics, Optimization, Statistical Modeling, Clustering, Data Reporting, Interviewing, Team Management, Computational Biology, Linear Optimization, DBSCAN, K-means Clustering, Clustering Algorithms, Bayesian Statistics, Time Series, Quantum Computing, Stochastic Modeling, Data Engineering, Computer Vision, Convolutional Neural Networks (CNNs), Classification, Regression, Principal Component Analysis (PCA), Biology, Marketing Mix