Thomas Debray
Verified Expert in Engineering
Data Scientist and Developer
Utrecht, Netherlands
Toptal member since March 18, 2022
Thomas has 17 years of experience in risk modeling and causal inference and has managed over €1 million in research funds as a scientist. Since 2019, he has worked as an independent contractor for various global pharmaceutical companies and CROs. His goal is to improve data-driven decision making by adopting state-of-the-art analysis methods and delivering scientific scrutiny in a timely fashion.
Experience
- MySQL - 15 years
- R - 14 years
- Machine Learning - 12 years
- Meta-analysis - 12 years
- Statistics - 12 years
- Causal Inference - 10 years
- Training & Training Content Development - 10 years
- Health Economics & Outcomes Research (HEOR) - 5 years
Preferred Environment
R, PHP, Statistics, Machine Learning, Risk Models, Causal Inference
The most amazing...
...tools I've developed are statistical methods, software, and guidelines that major scientific journals and international communities have endorsed.
Work Experience
Senior Statistician
Smart Data Analysis and Statistics
- Provided statistical support during the design and analysis of Phase IV trials, post-authorization safety studies, historical control studies, and pooled studies (e.g., meta-analysis).
- Built a Shiny app to facilitate blinded sample size re-estimation (BSSR) in bioequivalence studies with multiple primary endpoints and more than two treatment arms.
- Developed an R package for precision medicine. This package is hosted on CRAN and implements a doubly robust precision medicine approach to fit, cross-validate, and visualize prediction models for the conditional average treatment effect.
- Developed, evaluated, and implemented risk prediction models using R and Python.
- Set up advanced simulation studies using Google Cloud Platform (GCP) and Amazon Web Services (AWS).
- Managed several data scientists and statisticians to develop training materials on biostatistics and machine learning.
- Developed and maintained the company's main website using PHP and MySQL and integrated various libraries and APIs, such as Bootstrap, Carousel, Google Charts, and Calendly.
- Edited a handbook on conducting comparative effectiveness research and personalized medicine using real-world data.
Contract Senior Biostatistician
Undisclosed Pharmaceutical Company
- Developed a study protocol to create a synthetic control arm for a noninterventional cohort study.
- Reviewed statistical analysis plans for conducting a systematic literature review and network meta-analysis of randomized trials and real-world evidence.
- Critically reviewed available data sources and assessed their utility for generating a synthetic control arm.
Associate Editor
BioMed Central Ltd
- Managed article submissions and editorial peer review for the open-access journal BMC Diagnostic and Prognostic Research.
- Provided feedback to manuscript authors about the required revisions.
- Invited domain experts to provide critical reviews of submitted manuscripts.
Contract Senior Data Scientist
Proalto
- Assisted in the implementation of machine learning methods for credit risk modeling.
- Planned the development of a software platform for automating micro-loans.
- Critically reviewed quotes for the development of the software platform.
Assistant Professor
University Medical Center Utrecht
- Developed statistical methodology and guidelines for conducting risk prediction and causal inference. Key topics: regression, meta-analysis, multiple imputation, multilevel modeling, Bayesian inference, propensity score analysis, and machine learning.
- Created master-of-science courses, workshops, online training modules, and a wiki for education and information provision to international students and staff.
- Built an open-source R software package and maintained its updates and bug fixes via the Comprehensive R Archive Network (CRAN).
- Set up advanced simulation studies using Amazon Web Services (AWS) and Google Cloud Platform (GCP) to evaluate and compare the performance of analytical approaches.
- Developed and validated prediction models using penalized regression, multilevel regression, random forests, XGBoost, neural networks, and support vector machines.
- Acted as a principal investigator for various international projects funded by the European Commission and World Health Organization. Applied for national and international research grants.
- Managed an international team of master-of-science students, Ph.D. candidates, and post-docs and supervised their daily activities.
- Provided critical reviews and analytical support in epidemiological studies.
- Set up new collaborations with international organizations, including universities, healthcare agencies, and pharmaceutical companies.
- Published around 100 peer-reviewed scientific manuscripts.
Scientific Consultant
Undisclosed Health Technology Assessments (HTA) Agency
- Reviewed the validity of health economic models to assess the cost-effectiveness of a new therapy.
- Evaluated the Java source code of a discrete event simulation model to identify computational and coding errors.
- Verified the consistency between the technical report and the parameters and outputs of the discrete event simulation model.
- Provided scientific advice on improving the transparency and usability of the discrete event simulation model.
- Participated in teleconferences to discuss the technical report, disease and clinical area, and appropriateness of the health economic model.
- Reviewed the draft advice report from the client and addressed their queries via mail.
Contract Senior Biostatistician
Undisclosed Nonprofit Association
- Reviewed the study protocol for a systematic literature review and provided feedback on the required analysis steps.
- Conducted a multilevel meta-analysis of published evidence obtained through a literature review.
- Assisted in drafting the final report, preparing a scientific publication, and addressing reviewer comments relating to methodological and statistical inquiries.
- Developed R code to recover missing information from published reports and conducted a meta-analysis.
Contract Senior Biostatistician
Undisclosed Pharmaceutical Company
- Developed, evaluated, and implemented statistical methods for systematic review and meta-analysis of real-world evidence studies, causal inference, and imputation of missing data.
- Participated as a domain expert in advisory panels and discussed existing approaches for developing and validating risk prediction models using data from multiple sources.
- Provided critical input on statistical analysis plans, study designs, statistical approaches, and the interpretation of results, and supported the drafting of reports and manuscripts.
- Evaluated the performance of advanced data analysis methods using extensive simulation studies in R and JAGS on the Google Cloud Platform (GCP).
- Prepared a critical overview of existing statistical methods for synthesizing RCTs and observational data and assessed their strengths and weaknesses.
- Developed a statistical framework for predicting individualized treatment effects and conducted simulation studies to evaluate the accuracy of the resulting estimates.
- Managed several independent consultants to coordinate research and development activities.
- Developed R code for various advanced statistical methods and maintained updates via Git.
Contract Senior Data Scientist
Infodation B.V.
- Reviewed an R Shiny application to facilitate project planning and management.
- Identified and fixed software bugs using Git version control.
- Drafted a technical report with key recommendations for improving the R Shiny application and its long-term sustainability.
- Managed feedback and input from one independent consultant who maintained the R Shiny software.
Software Developer Consultant
Source NV-SA
- Maintained the front end and back end of Source NV-SA, which was acquired by Tech Data Corporation in 2010.
- Developed and implemented new modules for the content management system (CMS) of the company's main website.
- Developed a web-based Java tool to support customers in identifying an appropriate backup and sizing solution.
Experience
Metamisc | An R Package for Conducting Meta-analysis in Risk Prediction
https://CRAN.R-project.org/package=metamisc
I was the lead developer and incorporated functions to conduct multivariate meta-analysis to summarize estimates of prediction model performance (doi:10.1177/0962280218785504) and to evaluate the presence of publication bias (doi:10.1002/jrsm.1266).
The R package was initially developed to facilitate the education of master's degree-level and Ph.D. students and is now mainly used by researchers embarking on a systematic literature review.
In 2022, the R package was implemented as a formal extension module for the JASP software.
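To give a rough idea of what summarizing prediction model performance across studies involves, the sketch below pools c-statistics from several validation studies with a univariate DerSimonian-Laird random-effects model on the logit scale. It is a simplified illustration in Python, not the package's own code or API, and it does not cover the multivariate approach referenced above. The function name `pool_c_statistics`, its arguments, and the input values in the usage example are hypothetical.

```python
import math


def logit(p):
    """Log-odds transform of a proportion in (0, 1)."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of the logit transform."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_c_statistics(c_stats, std_errors):
    """Pool c-statistics with a DerSimonian-Laird random-effects model.

    Hypothetical helper for illustration only. Each c-statistic is
    transformed to the logit scale, pooled there, and back-transformed.
    """
    if len(c_stats) != len(std_errors) or len(c_stats) < 2:
        raise ValueError("Need at least two studies with matching inputs.")

    y = [logit(c) for c in c_stats]
    # Delta method: Var(logit c) is approximately SE(c)^2 / (c * (1 - c))^2
    v = [(se / (c * (1.0 - c))) ** 2 for c, se in zip(c_stats, std_errors)]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))

    # Method-of-moments estimate of the between-study variance, floored at 0
    scale = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(y) - 1)) / scale)

    # Random-effects weights add tau2 to each within-study variance
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_mu = math.sqrt(1.0 / sum(w_re))

    return {
        "pooled": expit(mu),
        "ci_lower": expit(mu - 1.96 * se_mu),
        "ci_upper": expit(mu + 1.96 * se_mu),
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Made-up c-statistics and standard errors from three validation studies
    summary = pool_c_statistics([0.70, 0.80, 0.75], [0.020, 0.030, 0.025])
    print(summary)
```

Pooling on the logit scale keeps the back-transformed summary and its confidence interval within the valid range of 0 to 1, which is one common reason for transforming the c-statistic before meta-analysis.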
Improving the Generalizability of Risk Models
Key references:
• https://doi.org/10.1002/sim.5732
• https://doi.org/10.1136/bmj.i6460
• https://doi.org/10.1177/0962280216660741
• https://doi.org/10.1016/j.jclinepi.2014.06.018
• https://doi.org/10.1002/sim.5412
Education
Master of Science Degree in Epidemiology
Utrecht University - Utrecht, The Netherlands
PhD in Epidemiology and Biostatistics
Utrecht University - Utrecht, The Netherlands
Master of Science Degree in Artificial Intelligence
Maastricht University - Maastricht, The Netherlands
Master of Science Degree in Computer Science
Hogeschool Gent - Gent, Belgium
Skills
Libraries/APIs
XGBoost
Tools
LaTeX, GitHub, Eclipse IDE, Microsoft PowerPoint, Git, Subversion (SVN)
Languages
R, PHP, Java, HTML, JavaScript, SQL, Python, COBOL, C#, CSS
Platforms
RStudio, Windows, Ubuntu, Linux Mint, Linux, macOS, Jupyter Notebook, Fedora, Google Cloud Platform (GCP), Amazon Web Services (AWS), Google Ads
Frameworks
RStudio Shiny, ASP.NET
Paradigms
Database Design
Storage
MySQL
Industry Expertise
Bioinformatics
Other
JAGS, WinBUGS, Training & Training Content Development, Statistics, Machine Learning, Bayesian Inference & Modeling, Biostatistics, Regression, Epidemiology, Causal Inference, Meta-analysis, Risk Models, Monte Carlo Simulations, Health Economics & Outcomes Research (HEOR), Literature Review, Data Science, Statistical Data Analysis, Predictive Modeling, Data Analytics, Scientific Data Analysis, Monday.com, Wikis, Clinical Trials, Graphical User Interface (GUI), Database Analytics, Data Mining, Image Processing, Information Retrieval, Signal Processing, Statistical Methods, Statistical Analysis, Markov Model, Markov Chain Monte Carlo (MCMC) Algorithms, Publishing, Data Visualization, Data Analysis, Programming, Big Data, Research, Predictive Analytics, Education, Fintech