Raisa Dzhamtyrova, PhD, Machine Learning Developer in Reading, United Kingdom
Member since October 7, 2022
Raisa has a PhD in machine learning and extensive research and industry experience. She has worked as a researcher at the UK's national institute for machine learning and AI and has five years of industry experience in data science and risk modeling. Raisa is proficient in Python, pandas, scikit-learn, R, SQL, Git, pytest, MLflow, Weights & Biases, GitHub Actions, FastAPI, AWS EC2, S3, and other tools.

Location

Reading, United Kingdom

Availability

Part-time

Preferred Environment

Python 3, R, SQL, Scikit-learn, Machine Learning, Data Science, Visualization, Real-time Data, Risk Models, Credit Risk

The most amazing...

...project I've developed is a new method of aggregating anomaly detection algorithms, which is to be used by digital identity provider companies.

Employment

  • Lecturer in Computer Science

    2021 - 2022
    Royal Holloway, University of London
    • Conducted research on conformal predictors, a method that provides probabilistic predictions instead of point forecasts. Presented the work to help attract industry collaboration.
    • Taught CS Database Systems (PostgreSQL) and CS Machine Fundamentals and supervised students' final-year projects.
    • Co-organized research seminars on advanced topics in Big Data.
    Technologies: Databases, SQL, Machine Learning, Data Science, Mathematics, Python 3, Scikit-learn, Real-time Data, Programming, Python, Data Analysis, Data Visualization, NumPy, Pandas, Ensemble Methods, Predictive Modeling, Predictive Analytics, Time Series, Time Series Analysis, Regression Modeling, Classification, Regression, Statistical Analysis, Version Control, Git, Statistics, Jupyter, Research, Technical Writing, Statistical Modeling, Data Analytics, Predictive Learning, Linear Regression, PostgreSQL, MySQL
  • Postdoctoral Research Associate

    2020 - 2021
    The Alan Turing Institute
    • Developed a new approach for dynamic cyber risk estimation that assesses the maximum number of hacking attempts with the desired confidence. I designed the prototype in R and published the research in Springer.
    • Created a new approach for aggregating unsupervised anomaly detection algorithms, which is to be used by digital identity provider companies. I developed the prototype in Python.
    • Presented my work at conferences, providing more exposure to the project and attracting more collaboration.
    Technologies: Machine Learning, Anomaly Detection, Risk Models, Python, Data Analysis, Data Visualization, Matplotlib, StatsModels, Visualization, Scikit-learn, Mathematics, Python 3, Real-time Data, Programming, Risk, Amazon S3 (AWS S3), NumPy, Pandas, Ensemble Methods, Predictive Modeling, Predictive Analytics, Time Series, Time Series Analysis, Regression Modeling, Classification, Regression, Statistical Analysis, Version Control, Git, Statistics, Jupyter, Amazon EC2, AWS CloudTrail, Amazon QuickSight, Boto 3, Research, Technical Writing, Amazon Athena, Statistical Modeling, Data Analytics, Predictive Learning, Linear Regression, Docker, Big Data, Data Reporting, Amazon Web Services (AWS), Amazon Machine Learning
  • Teaching Assistant

    2016 - 2019
    Royal Holloway, University of London
    • Assisted with CS Data Analysis, Machine Learning, and Data Visualization MSc courses.
    • Marked the assignments and prepared the coursework materials.
    • Provided supervision for final-year MSc students. Helped students with issues on their dissertations.
    Technologies: Data Analysis, Machine Learning, Data Visualization
  • Data Scientist

    2016 - 2018
    Lindgren Laboratories Limited
    • Developed models that predict the outcomes of football matches in an online (real-time) setting, using statistical and machine learning methods in R and Python.
    • Improved the existing model and state-of-the-art methods, which increased revenue by several percentage points.
    • Managed one of the team members by guiding his work and supervising project timelines.
    Technologies: Data Science, Databases, Machine Learning, Deep Learning, SQL, Python 3, R, Python, Data Analysis, Data Visualization, Matplotlib, Visualization, Scikit-learn, Real-time Data, Programming, NumPy, Pandas, Ensemble Methods, Predictive Modeling, Predictive Analytics, Time Series, Time Series Analysis, Regression Modeling, Classification, TensorFlow, Regression, Statistical Analysis, Version Control, Git, Statistics, Jupyter, Research, Statistical Modeling, Data Analytics, Predictive Learning, Sports, Linear Regression, Microsoft Excel, Spreadsheets, PostgreSQL, XGBoost, Big Data, Data Reporting, Tableau, MySQL, ETL
  • Chief Risk Specialist

    2014 - 2015
    Promsvyazbank
    • Developed new methods for collection risk models that improved the bank's collection strategies and decreased loan default rates.
    • Helped increase collaboration between the risk and the collection departments, leading to an improved risk-based collection strategy.
    • Performed mathematical and financial analyses for the risk committee and senior management, affecting the bank's future policies.
    Technologies: Credit Risk, Databases, SAS, Risk Models, SQL, Data Analysis, Real-time Data, Programming, Finance, Risk, Data Visualization, Predictive Modeling, Predictive Analytics, Time Series Analysis, Regression Modeling, Classification, Economics, Financial Data, Regression, Statistical Analysis, Version Control, Statistics, Statistical Modeling, Data Analytics, Predictive Learning, Linear Regression, Microsoft Excel, Spreadsheets, Software Engineering, PostgreSQL, Data Reporting, MySQL, ETL, Financial Modeling
  • Risk Analyst

    2012 - 2014
    National Bank TRUST
    • Applied various data science methods to improve the bank's marketing campaigns through efficient client segmentation.
    • Developed credit and fraud detection scoring models, decreasing the default and fraud rates on the bank's loans.
    • Delivered various analytics reports on the financial situation at the time, which helped to guide the department's strategies.
    Technologies: Credit Risk, Databases, Risk Models, SQL, SAS, Data Science, Data Analysis, Real-time Data, Programming, Finance, Risk, Data Visualization, Predictive Modeling, Predictive Analytics, Time Series Analysis, Regression Modeling, Classification, Economics, Financial Data, Regression, Statistical Analysis, Version Control, Statistics, Statistical Modeling, Data Analytics, Predictive Learning, Linear Regression, Microsoft Excel, Spreadsheets, Software Engineering, PostgreSQL, Data Reporting, MySQL, Financial Modeling

Experience

  • Competitive Online Algorithms for Probabilistic Prediction
    https://www.researchgate.net/profile/Raisa_Dzhamtyrova/research

    Most of my research was devoted to developing adaptive ensembles of machine learning models in real time. An important property of these ensembles is that at any time in the future, their performance will be close to the performance of the best model in this ensemble. These ensembles are highly adaptive to the newly arrived data, which is particularly important in the real-time setting. My research was published in journals such as Machine Learning, Data Mining and Knowledge Discovery, and Neurocomputing.
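
    The "close to the best model" property can be illustrated with a minimal sketch of a generic exponentially weighted average forecaster. This is a textbook technique, not necessarily the method from the published papers; the function name, the square loss, and the learning rate eta are assumptions made for illustration. For square loss with forecasts and outcomes in [0, 1] and eta = 0.5, the cumulative loss of this ensemble exceeds that of the best model by at most ln(N) / eta, where N is the number of models.

```python
import math


def exponential_weights(expert_predictions, outcomes, eta=0.5):
    """Aggregate expert forecasts online with exponential weights (square loss).

    expert_predictions: list of rounds, each a list with one forecast per expert.
    outcomes: list of realised values, one per round.
    Returns (ensemble_loss, expert_losses), the cumulative square losses.
    Illustrative sketch only; names and defaults are hypothetical.
    """
    n_experts = len(expert_predictions[0])
    expert_losses = [0.0] * n_experts
    ensemble_loss = 0.0

    for forecasts, outcome in zip(expert_predictions, outcomes):
        # Weight each expert by exp(-eta * cumulative loss); subtracting the
        # smallest loss first keeps the exponentials numerically stable.
        best = min(expert_losses)
        weights = [math.exp(-eta * (loss - best)) for loss in expert_losses]
        total = sum(weights)

        # The ensemble forecast is the weighted average of the expert forecasts.
        prediction = sum(w * f for w, f in zip(weights, forecasts)) / total

        # The outcome is revealed only after predicting; then losses are updated.
        ensemble_loss += (prediction - outcome) ** 2
        for i, forecast in enumerate(forecasts):
            expert_losses[i] += (forecast - outcome) ** 2

    return ensemble_loss, expert_losses
```

    Because the weights are recomputed from cumulative losses at every step, the ensemble shifts toward whichever model has performed best so far as new data arrives.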

  • Pandas Coverage Application
    https://pandas-coverage.herokuapp.com/

    An application for monitoring test coverage in pandas, used by the pandas development team. The application is built with Streamlit and hosted on Heroku. The data is stored in and downloaded from AWS S3.

  • Real-time Anomaly Detection
    https://github.com/alan-turing-institute/anomaly_with_experts

    The increasing connectivity of data and cyber-physical systems has resulted in a growing number of cyber-attacks. Real-time detection of such attacks, through the identification of anomalous activity, is required so that mitigation and contingent actions can be effectively and rapidly deployed.

    I created a new approach for aggregating unsupervised anomaly detection algorithms, which is to be used by digital identity provider companies. I developed the prototype in Python. The preprint is available at arxiv.org/pdf/2010.03857.pdf

  • Deploying a Machine Learning Model on Heroku with FastAPI
    https://github.com/raisadz/deployment_project

    The project aims to deploy an ML classification model on Heroku using FastAPI. Data Version Control (DVC) on AWS S3 is used for data versioning. API tests and unit tests to monitor the model performance on various data slices were implemented and incorporated into a CI/CD framework using GitHub Actions.
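
    Monitoring performance on data slices can be sketched as below. This is a minimal, hypothetical illustration rather than code from the repository: the function name, the row layout, and the "education" column are all assumptions.

```python
from collections import defaultdict


def accuracy_by_slice(rows, slice_key):
    """Return {slice value: accuracy} for rows holding 'label' and 'prediction'."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        value = row[slice_key]
        total[value] += 1
        correct[value] += int(row["label"] == row["prediction"])
    return {value: correct[value] / total[value] for value in total}


# Hypothetical scored rows; in practice these would come from the model.
rows = [
    {"education": "Bachelors", "label": 1, "prediction": 1},
    {"education": "Bachelors", "label": 0, "prediction": 1},
    {"education": "Masters", "label": 1, "prediction": 1},
    {"education": "Masters", "label": 0, "prediction": 0},
]
slice_accuracy = accuracy_by_slice(rows, "education")
```

    A unit test in a CI workflow could then assert that no slice falls below a chosen accuracy threshold, so that a regression on one subgroup is caught even when the overall metric looks healthy.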

  • Building ML Pipeline for Short-term Rental Prices in NYC
    https://github.com/raisadz/build-ml-pipeline-for-short-term-rental-prices

    The project aims to build a reproducible ML pipeline for estimating a property rental price using MLflow and Weights & Biases. New data comes every week, and the model needs regular retraining. An end-to-end reusable pipeline will enable an easy retraining process and reduce the time-to-production.

  • Dynamic Risk Assessment System
    https://github.com/raisadz/model_diagnostics

    The goal of the project is to set up processes and scripts to retrain, redeploy, monitor, and report on the ML model that estimates the attrition risk of a company. The project implements automatic data ingestion, training, scoring, deploying, and diagnostics.

  • Dynamic Cyber Risk Estimation
    https://github.com/alan-turing-institute/dynamic_cyber_risk

    I developed a new approach for dynamic cyber risk estimation that assesses the maximum number of hacking attempts with the desired confidence. I designed the prototype in R, and the research was published by Springer; it can be read at doi.org/10.1007/s10618-021-00814-z.

Skills

  • Languages

    R, SQL, Python, Python 3, SAS
  • Libraries/APIs

    Scikit-learn, Matplotlib, NumPy, Pandas, XGBoost, TensorFlow, REST APIs
  • Tools

    Git, Jupyter, Microsoft Excel, Spreadsheets, StatsModels, Tableau, AWS CloudTrail, Amazon QuickSight, Boto 3, Amazon Athena, Pytest, Cron
  • Paradigms

    Data Science, Anomaly Detection, ETL, Unit Testing
  • Storage

    Databases, PostgreSQL, MySQL, Amazon S3 (AWS S3), Data Pipelines
  • Other

    Machine Learning, Visualization, Credit Risk, Data Analysis, Data Visualization, Ensemble Methods, Predictive Modeling, Predictive Analytics, Time Series, Time Series Analysis, Regression Modeling, Classification, Regression, Statistical Analysis, Version Control, Statistics, Research, Technical Writing, Statistical Modeling, Data Analytics, Predictive Learning, Linear Regression, Big Data, Data Reporting, Financial Modeling, Real-time Data, Risk Models, Mathematics, Physics, Programming, Finance, Risk, Economics, Financial Data, Sports, Software Engineering, Amazon Machine Learning, Deep Learning, MLflow, Weights & Biases, CI/CD Pipelines, GitHub Actions, FastAPI, DVC, Machine Learning Operations (MLOps), Streamlit
  • Platforms

    Docker, Amazon Web Services (AWS), Amazon EC2, Heroku
  • Frameworks

    Flask

Education

  • PhD in Machine Learning
    2016 - 2020
    Royal Holloway, University of London - Egham, United Kingdom
  • Master's Degree (Outstanding Thesis Award) in Computational Finance
    2015 - 2016
    Royal Holloway, University of London - Egham, United Kingdom
  • Master's Degree in Applied Mathematics and Physics
    2011 - 2013
    Moscow Institute of Physics and Technology - Moscow, Russia
  • Bachelor's Degree in Applied Mathematics and Physics
    2007 - 2011
    Moscow Institute of Physics and Technology - Moscow, Russia

Certifications

  • Machine Learning DevOps Engineer
    NOVEMBER 2022 - PRESENT
    Udacity
