Sultan Orazbayev, Developer in Almaty, Almaty Region, Kazakhstan

Sultan Orazbayev

Verified Expert in Engineering

Data Scientist and Developer

Almaty, Almaty Region, Kazakhstan

Toptal member since May 10, 2022

Bio

Sultan is a data scientist with training in the social sciences. He is experienced in describing key trends and patterns in structured and unstructured data and in answering questions related to international economics, social networks, migration, and innovation. Sultan uses Python (especially Pandas, NumPy, and Dask) to deliver practical, real-world tools for government, finance, R&D, and education clients.

Portfolio

Self-employed
Python, NumPy, STATA, Pandas, Bash, Git, Graph API, Dask, Big Data, GitHub...
Harvard University
Python, Dask, Bash, Pandas, NumPy, SciPy, Scikit-learn, STATA, Data Analysis...
Private Endowment Fund
STATA, Microsoft Excel, Bloomberg, Data Analysis, Data Analytics, Econometrics...

Experience

  • STATA - 15 years
  • Dask - 6 years
  • Scikit-learn - 6 years
  • Pandas - 6 years
  • Data Science - 6 years
  • Python - 6 years
  • Machine Learning - 5 years
  • NetworkX - 4 years

Availability

Full-time

Preferred Environment

MacOS, Python, Conda, Linux, Jupyter, Jupyter Notebook

The most amazing...

...project I’ve worked on was the record linkage and analysis of hundreds of millions of individuals who lived in the United States in the 19th and 20th centuries.

Work Experience

Freelancer

2022 - PRESENT
Self-employed
  • Developed custom computational workflows with interactive visualization using Python and scientific computing libraries for clients in manufacturing and B2B sales.
  • Optimized an existing Python-based computation to run 100x faster and to distribute the work across multiple computers.
  • Performed extensive data cleaning on a terabyte-scale unstructured dataset.
Technologies: Python, NumPy, STATA, Pandas, Bash, Git, Graph API, Dask, Big Data, GitHub, Bokeh, Matplotlib, NetworkX, Data Visualization, DataFrames, Snakemake, Apache Airflow, Statistical Analysis, Amazon Web Services (AWS), Data, Data Science, Record Linkage, Machine Learning, Deduplication, Jupyter Notebook, XGBoost, GeoPandas, GIS, Maps, Docker, Pattern Matching, Large-scale Projects, Algorithms, Applied Mathematics, Data Modeling, Generative Artificial Intelligence (GenAI), Regression Modeling, Large Language Models (LLMs)

Postdoctoral Fellow (Center for International Development)

2018 - 2022
Harvard University
  • Contributed to ongoing academic research projects at the center, primarily through data engineering and data analysis.
  • Developed and taught advanced data processing techniques to peers and junior colleagues (workshops, guest lectures, and seminars).
  • Developed custom Python-based workflows (Snakemake) for reproducible analysis and processing of large-scale dumps of unstructured information (images, text).
Technologies: Python, Dask, Bash, Pandas, NumPy, SciPy, Scikit-learn, STATA, Data Analysis, Data Analytics, Data Science, Statistics, Big Data, Data Engineering, Prefect, Econometrics, Microeconomics, Economic Analysis, Computational Economics, Machine Learning, NetworkX, GitHub, Git, SQL, SQLite, Web Scraping, Record Linkage, Neo4j, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Python 3, Jupyter, Data Pipelines, ETL, JSON, Databases, Data Visualization, DataFrames, Snakemake, Statistical Analysis, Data, Deduplication, Jupyter Notebook, XGBoost, GeoPandas, GIS, Maps, Docker, Pattern Matching, Large-scale Projects, Algorithms, Applied Mathematics, Data Modeling, Regression Modeling

Economist

2017 - 2019
Private Endowment Fund
  • Contributed to developing an analytical framework for asset allocation at a new private endowment fund.
  • Produced weekly and monthly economic analyses for the investment committee.
  • Analyzed financial and economic data to monitor macroeconomic trends in the domestic and international economy.
Technologies: STATA, Microsoft Excel, Bloomberg, Data Analysis, Data Analytics, Econometrics, Macroeconomic Forecasting, Microeconomics, Economic Analysis, Web Scraping, Data Pipelines, ETL, Data Visualization, Forecasting, DataFrames, Statistical Analysis, Data, Pattern Matching, Applied Mathematics

Teaching Fellow

2010 - 2016
University College London
  • Taught a year-long introductory economics sequence (micro/macro) to a small class of pre-undergraduate students (25-30 students per year).
  • Contributed to course management and administration, including student assessment.
  • Supported the subsequent placement of students at leading undergraduate programs in Economics.
Technologies: STATA, Data Analysis, Data Analytics, Microeconomics, Pattern Matching

Economist

2007 - 2010
Applied Research Center
  • Published applied economic research jointly with international collaborators (Ifo Institute, Germany).
  • Produced weekly and monthly analytical notes on macroeconomics and finance for senior policymakers.
  • Developed economic models for forecasting and nowcasting the domestic economy.
Technologies: Bash, Statistics, STATA, Bloomberg, Bloomberg Terminal, Microsoft Excel, EViews, Macroeconomics, Data Analysis, Data Analytics, Econometrics, Macroeconomic Forecasting, Microeconomics, Economic Analysis, Computational Economics, Web Scraping, Forecasting, Data, Regression Modeling

Projects

Large-scale Record Linkage of Noisy Data

This project used historical census data on hundreds of millions of individuals who lived in the United States during the 19th and early 20th centuries. The main challenges were the scale of the computations, resolved by developing custom pipelines with Dask, and noise in the data, handled with custom algorithms built on Pandas, SciPy, and NumPy.
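To illustrate the general idea of linking noisy records, here is a minimal sketch and not the project's actual pipeline. It assumes a simple blocking key (surname initial plus birth year) to limit comparisons and uses the standard library's `difflib.SequenceMatcher` for fuzzy name matching in place of the Dask, Pandas, SciPy, and NumPy tooling named above. The field names, the `block_key` and `link_records` helpers, the threshold, and the sample records are all hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def block_key(record):
    """Hypothetical blocking key: surname initial plus birth year."""
    return (record["surname"][:1].upper(), record["birth_year"])


def name_similarity(a, b):
    """Similarity ratio in [0, 1] between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_records(left, right, threshold=0.85):
    """Return (left_id, right_id, score) for each left record's best match."""
    # Index right-hand records by block so only plausible pairs are compared.
    blocks = defaultdict(list)
    for rec in right:
        blocks[block_key(rec)].append(rec)

    links = []
    for rec in left:
        best = None
        for cand in blocks.get(block_key(rec), []):
            score = name_similarity(
                f'{rec["given"]} {rec["surname"]}',
                f'{cand["given"]} {cand["surname"]}',
            )
            # Keep only the highest-scoring candidate at or above the threshold.
            if score >= threshold and (best is None or score > best[2]):
                best = (rec["id"], cand["id"], score)
        if best is not None:
            links.append(best)
    return links


# Made-up example records, not real census data.
census_a = [
    {"id": "a1", "given": "William", "surname": "Johnson", "birth_year": 1872},
    {"id": "a2", "given": "Mary", "surname": "Oconnor", "birth_year": 1880},
]
census_b = [
    {"id": "b1", "given": "Wiliam", "surname": "Johnson", "birth_year": 1872},
    {"id": "b2", "given": "Mary", "surname": "Smith", "birth_year": 1880},
]

links = link_records(census_a, census_b)
```

Blocking keeps the number of pairwise comparisons manageable, which is the part that matters most at scale. An exact birth-year block is itself an assumption here; noisy data usually calls for more forgiving keys.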

Custom Astronomical Observation Planning Tool

A Python-based application for custom astronomical observation planning. The application ran the required computations and visualized the key results. It was built using Panel, HoloViews, astroplan, Pandas, NumPy, and Joblib.

Classifying Sequences of Events

The goal of this project was to classify semi-structured sequences of events for use as input to downstream analytics. Given the context and the intended use case, a weak supervision approach was used to label and group the activities. This project used Python, Snorkel, HoloViews, Pandas, and scikit-learn.
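As a rough illustration of weak supervision, the sketch below combines a few heuristic labeling functions by majority vote. It is not the project's code and does not use the Snorkel API: a plain majority vote stands in for Snorkel's label model, and the event names, class labels, and labeling functions are hypothetical.

```python
from collections import Counter

ABSTAIN = None  # returned by a labeling function that has no opinion


def lf_has_save(events):
    """Heuristic: a sequence that saves something is probably editing."""
    return "editing" if "save" in events else ABSTAIN


def lf_mostly_views(events):
    """Heuristic: a sequence dominated by views is probably browsing."""
    views = sum(1 for e in events if e == "view")
    return "browsing" if events and views / len(events) > 0.5 else ABSTAIN


def lf_starts_with_edit(events):
    """Heuristic: a sequence that opens with an edit is probably editing."""
    return "editing" if events[:1] == ["edit"] else ABSTAIN


LABELING_FUNCTIONS = [lf_has_save, lf_mostly_views, lf_starts_with_edit]


def weak_label(events):
    """Majority vote over non-abstaining functions; ABSTAIN on a tie or no votes."""
    votes = Counter()
    for lf in LABELING_FUNCTIONS:
        label = lf(events)
        if label is not ABSTAIN:
            votes[label] += 1
    if not votes:
        return ABSTAIN
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting heuristics with equal support
    return ranked[0][0]


# Made-up event sequences for illustration only.
sequences = [
    ["edit", "view", "save"],
    ["view", "view", "open"],
    ["open", "close"],
    ["view", "view", "view", "save"],
]
labels = [weak_label(seq) for seq in sequences]
```

Sequences that end up without a label can be left out of training or routed to manual review. The labeled ones can then train a conventional classifier, for example with scikit-learn.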

Education

2012 - 2016

PhD in Economics

University College London - London, UK

2006 - 2007

Master's Degree in Economics

London School of Economics and Political Science - London, UK

2002 - 2005

Bachelor's Degree in Economics

Simon Fraser University - Burnaby, BC, Canada

Certifications

JANUARY 2023 - PRESENT

NVIDIA DLI Certificate: Accelerating End-to-End Data Science Workflows

NVIDIA Deep Learning Institute

Skills

Libraries/APIs

Dask, Pandas, NumPy, SciPy, Scikit-learn, NetworkX, XGBoost, Joblib, Graph API, Matplotlib, HoloViews, Astropy

Tools

Snakemake, Jupyter, STATA, Microsoft Excel, Git, GitHub, Apache Airflow, Bloomberg, EViews, MATLAB, Prefect, GIS

Platforms

Jupyter Notebook, MacOS, Linux, Bloomberg Terminal, Amazon Web Services (AWS), Docker

Languages

Python, Bash, Python 3, SQL

Paradigms

ETL

Storage

Data Pipelines, JSON, Databases, Neo4j, SQLite

Other

Record Linkage, Economics, Data Visualization, Data Matching, Pattern Matching, Deduplication, Data, Large-scale Projects, Conda, Data Science, Machine Learning, Econometrics, Data Analysis, Data Analytics, Big Data, Data Engineering, Macroeconomic Forecasting, Microeconomics, Economic Analysis, Computational Economics, Jupyter, Networks, Statistical Analysis, Maps, Algorithms, Statistical Modeling, Applied Mathematics, Data Modeling, Generative Artificial Intelligence (GenAI), Regression Modeling, Large Language Models (LLMs), Statistics, Macroeconomics, Web Scraping, Natural Language Processing (NLP), Forecasting, GPU Computing, Graphs, DataFrames, Bokeh, GeoPandas, Tabulator, Astroplan, Snorkel, Generative Pre-trained Transformers (GPT)
