Saikat Banerjee, Developer in Chicago, IL, United States

Saikat Banerjee

Verified Expert in Engineering

Linear Regression Developer

Chicago, IL, United States
Toptal Member Since
July 29, 2022

Saikat is a postdoctoral scientist at the University of Chicago with a PhD in computational biophysics and a master's degree in chemistry. He is an expert in biostatistics, statistical genetics, Bayesian methods, and machine learning. As a graphic design and web development freelancer, Saikat co-founded a marketing management company. He enjoys solving problems, creating value, and learning new expert-level skills.



Preferred Environment

Ubuntu, Python, C++

The most amazing...

...method I've developed helped scientists to discover the network of human genome transcriptional regulation.

Work Experience

Postdoctoral Scientist

2020 - PRESENT
The University of Chicago
  • Led multiple projects on Bayesian statistics with international collaborations and challenging deadlines.
  • Developed machine learning algorithms for sparse multiple regression.
  • Introduced a gradient descent technique for variational inference.
Technologies: Statistical Methods, Bayesian Statistics, Linear Regression, Logistic Regression, Predictive Modeling, Machine Learning

Postdoctoral Scientist

2015 - 2020
Max Planck Society
  • Developed statistical methods to understand disease mechanisms from large-scale biomedical data.
  • Collaborated with medical doctors, leading to two peer-reviewed publications.
  • Presented our work at the 2019 International Society for Computational Biology conference and 2020 e:Med; invited to give a visiting lecture at the University of Göttingen.
  • Supervised a master's thesis and mentored three internship students.
Technologies: Bayesian Statistics, Statistical Methods, Linear Regression, Logistic Regression, Predictive Modeling, Machine Learning

Projects

Trans-eQTL Discovery from GTEx Data
Genetic variants that regulate distant target genes are called trans-acting expression quantitative trait loci (trans-eQTLs). Many genetic variants are believed to mediate disease risk through trans-eQTL effects, so discovering trans-eQTLs and understanding their mechanisms is crucial to revealing how genetic variants are linked to disease phenotypes. Identifying trans-eQTLs is challenging because of their small effect sizes, tissue specificity, and a severe multiple-testing burden.

Our goal was to develop a reliable method for identifying trans-eQTLs. We proposed a new model and created open-source software. Applying our method to the eQTL data from the Genotype-Tissue Expression Project (GTEx) showed that it performs significantly better than state-of-the-art methods.
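To give a sense of the multiple-testing burden mentioned above, the short sketch below computes a Bonferroni-corrected significance threshold when every variant is tested against every gene. It is only an illustration of the scale of the problem, not the method described in this project; the function name and the variant and gene counts are hypothetical round numbers, not figures from GTEx.

```python
def bonferroni_threshold(n_variants: int, n_genes: int, alpha: float = 0.05) -> float:
    """Per-test p-value threshold that controls the family-wise error rate at alpha."""
    n_tests = n_variants * n_genes  # every variant is tested against every gene
    return alpha / n_tests


# Hypothetical round numbers, chosen only for illustration.
threshold = bonferroni_threshold(n_variants=1_000_000, n_genes=20_000)
print(f"Per-test threshold: {threshold:.1e}")
```

With tens of billions of tests, a variant with a small effect needs an extremely small p-value to pass such a threshold, which is why naive pairwise testing has little power in this setting.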

Bayesian Multiple Logistic Regression
Logistic regression is the method of choice for analyzing binary outcomes. Multiple logistic regression uses numerous variables in a logistic model. Bayesian multiple logistic regression offers several benefits, including variable selection, prediction, easier interpretation of results, and the ability to leverage prior information. However, it requires either costly and technically challenging Markov chain Monte Carlo (MCMC) sampling or approximations that significantly reduce the logistic model's flexibility.

We proposed a methodology using the point-normal prior for faster and more accurate Bayesian multiple logistic regression and developed open-source software for the project. Applying our method to human genetics data, we showed that it outperforms state-of-the-art methods in variable selection and prediction for sparse, high-dimensional multiple logistic regression problems (p >> n problems).
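For readers unfamiliar with the setting, the sketch below simulates data of the kind described above: coefficients drawn from a point-normal prior (exactly zero with some probability, otherwise normally distributed) and binary outcomes generated through a logistic model with more variables than samples. It illustrates the data-generating assumptions only and is not the inference method or the software from this project; the function names, the parameters `pi0` and `sigma`, and all numeric values are assumptions chosen for illustration.

```python
import math
import random


def sample_point_normal(p, pi0, sigma, rng):
    """Draw p coefficients: exactly zero with probability pi0, else N(0, sigma^2)."""
    return [0.0 if rng.random() < pi0 else rng.gauss(0.0, sigma) for _ in range(p)]


def simulate_logistic(n, beta, rng):
    """Simulate n samples with standard-normal features and Bernoulli outcomes."""
    X = [[rng.gauss(0.0, 1.0) for _ in beta] for _ in range(n)]
    y = []
    for row in X:
        eta = sum(x * b for x, b in zip(row, beta))  # linear predictor
        prob = 1.0 / (1.0 + math.exp(-eta))          # logistic link
        y.append(1 if rng.random() < prob else 0)
    return X, y


rng = random.Random(0)  # fixed seed so the simulation is reproducible
beta = sample_point_normal(p=200, pi0=0.95, sigma=1.0, rng=rng)
X, y = simulate_logistic(n=50, beta=beta, rng=rng)  # p = 200 variables, n = 50 samples

n_nonzero = sum(b != 0.0 for b in beta)
print(n_nonzero, "of", len(beta), "coefficients are nonzero")
```

Because most coefficients are exactly zero, variable selection amounts to deciding which of the many candidate variables belong to the small nonzero set.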


Skills

Python, Bash, HTML, PHP, CSS, Fortran, C++, Hugo, CSS3


NumPy, SciPy, Scikit-learn, Matplotlib, MPI, OpenMP


Jupyter, Shell, Adobe Illustrator, GitHub, Adobe Photoshop


Ubuntu, Linux, Debian


Bayesian Statistics, Statistical Methods, Linear Regression, Logistic Regression, Biostatistics, Predictive Modeling, Machine Learning, Research, Mechanics, Generalized Linear Model (GLM), Mixed-effects Models, Biophysics, Data Analysis, Computational Biological Physics


Data Science, Parallel Programming



Education

2010 - 2015

PhD in Computational Biophysics

Indian Institute of Science - Bangalore, India

2007 - 2010

Master's Degree in Chemistry

Indian Institute of Science - Bangalore, India