
Shing Chan
Verified Expert in Engineering
Machine Learning Developer
Asuncion, Paraguay
Toptal member since October 27, 2021
Shing is a researcher/developer with extensive experience building ML systems across various industries: healthcare (risk scores, sensor analytics, epidemiology), marketing (CLV, churn models), finance (index replication, trading systems), sports analytics (NBA/NFL), geophysics (well placement, reservoir modeling), and aeronautics (GPU fluid sims). He holds a PhD (2018) in physics-informed ML and generative AI for oil and gas engineering and is currently at Oxford, advancing AI health analytics.
Portfolio
Experience
- Python - 10 years
- Keras - 9 years
- Physics Simulations - 9 years
- PyTorch - 9 years
- Machine Learning - 9 years
- Deep Learning - 9 years
- Generative Adversarial Networks (GANs) - 8 years
- Time Series Analysis - 5 years
Preferred Environment
Linux, Git, PyTorch, Keras, Bash, Fortran, Vim Text Editor, Amazon Web Services (AWS), Python
The most amazing...
...thing I've developed is a generative AI method for geomodelling, published in CompGeosci 2019.
Work Experience
Researcher
IDX Digital Assets
- Developed a replication model for the Refinitiv Venture Capital Index.
- Built a medium/low-frequency trading system for digital assets based on features derived from the price series and macroeconomic factors.
- Implemented indicators for trend and risk management to reduce drawdown.
Researcher
University of Oxford
- Created PyPI packages for wearable sensor analytics, used by hundreds of researchers as well as pharmaceutical companies (GlaxoSmithKline, Novo Nordisk, Johnson & Johnson).
- Researched the added value of alternative data (wearable sensors) on existing clinical risk models (https://qrisk.org/).
- Researched the use of deep learning methods to extract behavioral insights from wearable sensor data.
- Co-developed the first tera-scale foundation model for accelerometer data (trained on 700,000 person-days of sensor data via self-supervised learning).
- Built pipelines for large-scale multi-node, multi-GPU training of deep learning models (over 20 terabytes of data).
- Built comprehensive evaluation pipelines for accelerometer-based activity recognition, comparing different models (CNNs, LSTMs, HMMs, tree models) on various open datasets.
- Applied time-to-event methods (Cox regression, survival forests) to model time to hospitalization or death based on patient characteristics and alternative data such as wearable sensor data (0.6-0.7 C-index).
Senior Data Scientist
EMoodie
- Prototyped mental health monitoring using speech-based emotion recognition and change-point detection.
- Built pipelines for noise reduction, audio enhancement, and anomaly detection.
- Co-wrote government grant proposals, successfully securing £500,000 in funding.
Researcher
subconscious.ai
- Developed methodologies based on large language models for generating synthetic respondents for survey simulations, with an emphasis on realistic demographic profiles.
- Built LLM-based tools to format and summarize academic papers on conjoint analysis and survey-based market research.
- Conducted prompt engineering experiments to replicate findings from conjoint analysis and market research studies.
Machine Learning Engineer
Yofi
- Adapted the Buy Till You Die model for customer churn and lifetime value, simplifying the Beta-Geometric/NBD formulation for efficient implementation in SQL for real-time eCommerce applications.
- Enhanced a bot detection model by integrating features derived from telemetry and sensor data, reducing bad actors and low-value eCommerce customers.
- Developed predictive models and built pipelines for model training.
Machine Learning Expert
KEG Systems LLC
- Researched novel features (e.g., player-player, player-team, team-team interaction features) to predict game outcomes for sports betting (e.g., money line, over-under, and spread), emphasizing calibration to inform bet sizing and risk management.
- Created reproducible pipelines for daily retraining, including feature selection, fine-tuning, and pruning.
- Oversaw deployment and decision-making, betting with real money and tweaking metamodels based on feedback.
PhD Candidate
Heriot-Watt University
- Developed a physics-informed machine learning model to speed up computationally expensive Monte Carlo fluid simulations.
- Developed a novel framework for geological reconstruction based on generative models (e.g., GANs, VAEs) to enhance geological realism for improved accuracy of oil production forecasts in Bayesian history matching.
- Created Python packages for subsurface fluid simulations.
Engineering Intern
FAdeA
- Assisted in the maintenance and repair of aircraft components.
- Assessed the capabilities of aircraft repair stations, making sure tools and procedures were in order according to technical manuals.
- Issued reports documenting deviations from technical manuals, including changes in procedures and the use of original equipment manufacturer (OEM) or refurbished parts.
Research and Development Intern
Instituto Universitario Aeronáutico
- Contributed to in-house software for viscous flow simulation, extending it with the arbitrary Lagrangian-Eulerian formulation on unstructured grids.
- Identified bottlenecks in the simulation software and parallelized them with OpenMP where possible.
- Ported code sections to CUDA Fortran to enable GPU acceleration, resulting in a speed-up of more than 10 times.
Experience
Package for Processing and Analysis of Wearables' Data for Health Analytics
https://github.com/activityMonitoring/biobankAccelerometerAnalysis
Numerical Optimization with Natural Evolution Strategies
https://github.com/chanshing/xnes
Synthesis of Geological Images
https://github.com/chanshing/geocondition
Physics-informed Machine Learning for Accelerated Simulations
https://www.sciencedirect.com/science/article/abs/pii/S0021999117307933?utm_medium=email
Education
PhD in Petroleum Engineering
Heriot-Watt University - Edinburgh, United Kingdom
Engineer's Degree in Aerospace Engineering
Instituto Universitario Aeronáutico - Cordoba, Argentina
Certifications
Financial Markets
Yale University | via Coursera
Heterogeneous Parallel Programming
University of Illinois | via Coursera
Programming Mobile Applications for Android Handheld Systems
University of Maryland | via Coursera
Skills
Libraries/APIs
PyTorch, Keras, Scikit-learn, TensorFlow, OpenMP, XGBoost
Tools
Git, MATLAB, Vim Text Editor, AWS CloudFormation
Languages
Python, Java, Fortran, Bash, C, SQL, R
Platforms
Linux, NVIDIA CUDA, Android, Amazon Web Services (AWS), Shopify
Paradigms
Parallel Programming, Anomaly Detection
Storage
PostgreSQL, Elasticsearch, MongoDB
Other
Physics Simulations, Machine Learning, Deep Learning, Time Series Analysis, Generative Adversarial Networks (GANs), Artificial Intelligence (AI), Time Series, Predictive Modeling, Data Analytics, Algorithmic Trading, Algorithms, Signal Processing, Data Mining, Natural Language Processing (NLP), Numerical Methods, Numerical Analysis, Physics, GPU Computing, Computational Fluid Dynamics (CFD), Computer Vision, Data Science, Scientific Computing, Finance, Aerodynamics, Numerical Simulations, Optimization, Convolutional Neural Networks (CNNs), Wearables, Fitness Trackers, Risk Models, Variational Autoencoders, Recurrent Neural Networks (RNNs), Health, Aerospace & Defense, Aircraft & Airlines, Engineering, Trading, Gambling, Sports, Data Scientist, WandB, OpenAI, Large Language Models (LLMs), Monte Carlo Simulations, Hugging Face, Customer Lifetime Value (CLV), Churn Analysis, Forecasting, Data Engineering, Electronic Health Records (EHR)