Expert Data Scientist
2020 - PRESENT
Ada Health
- Developed a synthetic patient case generator based on symptom-disease networks. The project enabled new downstream analyses, such as demonstrating the effectiveness of an active learning model before deploying it to the customer.
- Created methods for forecasting individual disease risk. Implemented risk predictors for diabetes and cardiovascular diseases.
- Co-developed an NLP pipeline for extracting medical information (diseases, symptoms, risk factors, and epidemiological information) from scientific articles and implemented the corresponding information architecture.
- Investigated the potential of polygenic risk scores to improve individual disease risk prediction. Planned and executed experiments to evaluate polygenic risk scores using UK Biobank data.
- Organized bi-weekly meetings to exchange knowledge about AI and data science topics across the company.
- Mentored other data scientists across the company.
Technologies: Python, Google Cloud, SciPy, NumPy, Pandas, Deep Learning, Java, JavaScript, SQL, Bayesian Machine Learning, SpaCy, NLTK, Machine Learning
Data Scientist
2019 - 2019
Causaly
- Aggregated and curated multiple biomedical ontologies into one coherent knowledge graph.
- Developed a knowledge graph model and a convolutional inference scheme for a biomedical network graph constructed from several ontologies.
- Applied the inference method for predicting side effects of recently approved drugs.
Technologies: Python, TensorFlow, SciPy, NumPy, Pandas, Deep Learning, Machine Learning