Alexander Chistyakov, Developer in Bozeman, MT, United States

Alexander Chistyakov

Software Engineer and Developer

Bozeman, MT, United States

Toptal member since December 30, 2025

Bio

Alexander is a senior full-stack ML engineer with 5+ years of experience. He has worked in quantitative finance, developing ML-based trading and arbitrage strategies, and in quantitative sports betting. He has designed, built, and deployed robust data pipelines for GBDTs, regression models, and transformers. With extensive experience in PyTorch, LightGBM, XGBoost, and numerous other ML libraries, Alexander is skilled in deploying infrastructure and creating pipelines from scratch on bare Linux.

Portfolio

Xtrades
Sentiment Analysis, Variational Autoencoders (VAEs), LightGBM, Scikit-learn...
Propsbot.ai
XGBoost, LightGBM, Scikit-learn, Pandas, Torch, Backtesting...
Onebrain Technologies
Python, XGBoost, LightGBM, Torch, TensorFlow, Scikit-learn, Pandas...

Experience

  • Pandas - 6 years
  • Torch - 5 years
  • Reinforcement Learning - 5 years
  • Backtesting - 5 years
  • Probabilistic Graphical Models - 5 years
  • XGBoost - 5 years
  • Sentiment Analysis - 4 years
  • LightGBM - 4 years

Preferred Environment

Vim Text Editor, Linux, LightGBM, PyTorch

The most amazing...

...project I've worked on involved developing ML-based trading bots based on people's descriptions of their trading strategies.

Work Experience

Data Scientist and Software Engineer

2025 - PRESENT
Xtrades
  • Deployed and monitored Gradient Boosted Decision Tree (GBDT) models using LightGBM for predictive signal scoring, trade filtering, and model-driven decision support.
  • Designed and implemented real-time financial data pipelines for market data aggregation, validation, and feature generation used across trading and analytics systems.
  • Integrated news sentiment analysis pipelines to enrich trading signals, combining NLP-based sentiment features with price and volume data to improve signal robustness.
  • Developed simulation and stress-testing frameworks using variational autoencoders to generate synthetic market scenarios, enabling strategy validation under rare or unseen market conditions.
  • Applied statistical validation and monitoring techniques to ensure model stability, drift detection, and performance consistency in live trading environments.
Technologies: Sentiment Analysis, Variational Autoencoders (VAEs), LightGBM, Scikit-learn, Pandas, Torch, Apache Airflow, Transformers, Backtesting, Simulations, Bayesian Statistics, Model Validation, Artificial Intelligence (AI), Large Language Models (LLMs), Architecture, Back-end, Machine Learning, Natural Language Processing (NLP), Distributed Systems, Multimodal Models, vLLM, Retrieval-augmented Generation (RAG), AI Chatbots, Vector Databases, Generative Artificial Intelligence (GenAI), Azure OpenAI Service, Keras, Full-stack, Agentic AI, Amazon Bedrock, Amazon SageMaker, Amazon Web Services Agent Core, Data Processing, Data Visualization, Financial Modeling, Pattern Recognition, Predictive Analytics, PostgreSQL, CI/CD Pipelines, API Integration, Vercel, Docker, Data Science, Deep Neural Networks (DNNs), RAG Architecture, NLU, Model Deployment, Data Pipelines, Kubernetes, Pgvector, Light LLMs, Data Analysis, Data Engineering, Data Cleaning, Data Labeling, Data Modeling, ETL Pipelines

Data Scientist

2025 - 2025
Propsbot.ai
  • Built and maintained data ingestion pipelines using Dagster and Apache Airflow, integrating real-time sports data from Odds API and MySportsFeeds.
  • Developed player prop prediction models for NFL and NHL using XGBoost, including model calibration for online inference.
  • Implemented Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM) to capture player-level latent states and seasonal dynamics, improving predictive performance.
  • Designed and executed A/B testing frameworks to evaluate and deploy challenger models in production.
Technologies: XGBoost, LightGBM, Scikit-learn, Pandas, Torch, Backtesting, Bayesian Statistics, Probabilistic Graphical Models, Databricks, Apache Airflow, Dagster, Bayesian Machine Learning, Spark, Spark SQL, Artificial Intelligence (AI), Machine Learning, Node.js, RESTFul APIs, TypeScript, Keras, Multi-tenant SaaS, Data Processing, Pattern Recognition, PostgreSQL, CI/CD Pipelines, Vercel, Snowflake, AWS Glue, JavaScript, Amazon S3 (AWS S3), Next.js, Data Science, ETL, Deep Neural Networks (DNNs), Model Deployment, Machine Learning Operations (MLOps), Data Pipelines, Kubernetes, Tekton, Amazon EC2, Amazon Kinesis, Amazon EKS, Pgvector, Data Analysis, Data Engineering, MySQL, Data Cleaning, SQL, Data Labeling, Data Modeling, ETL Pipelines

Full-stack ML Engineer

2020 - 2024
Onebrain Technologies
  • Built and deployed Python-based AI models for statistical arbitrage, integrated into live trading systems.
  • Deployed production-grade ML pipelines using TensorFlow, Kubernetes, and Apache Airflow, achieving a 40% reduction in signal latency.
  • Fine-tuned a news sentiment analysis model using DeBERTa, integrated with a Hierarchical Hidden Markov Model (HHMM) to detect market regimes and dynamically condition trading signals on macro and sentiment states.
  • Developed advanced quantitative trading strategies using ensemble learning, Bayesian models, and reinforcement learning for adaptive indexing, regime-aware weighting, portfolio rebalancing, and execution optimization.
  • Implemented advanced risk management using Causal Bayesian Networks, Topological Data Analysis, and Extreme Value Theory for systemic, regime, and tail-risk modeling.
  • Built high-frequency execution systems with latency optimization, predictive order management, and real-time anomaly detection.
  • Designed continuous adaptation frameworks, leveraging online learning, transfer learning, and evolving model ensembles to maintain performance across changing market conditions.
Technologies: Python, XGBoost, LightGBM, Torch, TensorFlow, Scikit-learn, Pandas, Transformers, Statistical Arbitrage, Bayesian Machine Learning, Sentiment Analysis, Fine-tuning, Reinforcement Learning, Variational Autoencoders (VAEs), Diffusion Models, Topological Data Analysis, Graph Neural Networks (GNNs), Causal AI, Backtesting, LSTM, FastAPI, Artificial Intelligence (AI), Large Language Models (LLMs), Architecture, Back-end, Machine Learning, AI Voice Agents, Conversational AI, Natural Language Processing (NLP), RESTFul APIs, TypeScript, Voice, VoiceBot, Twilio API, Vapi, Distributed Systems, Multimodal Models, Large-scale Distributed Systems, vLLM, Retrieval-augmented Generation (RAG), AI Chatbots, Data Protection, Vector Databases, LangChain, OpenAI, Mistral AI, HIPAA Compliance, Generative Artificial Intelligence (GenAI), Azure, Keras, Full-stack, Amazon Web Services (AWS), Multi-tenant SaaS, Agentic AI, Amazon Bedrock, Amazon SageMaker, Amazon Web Services Agent Core, Data Processing, Data Visualization, Financial Modeling, Pattern Recognition, Predictive Analytics, PostgreSQL, CI/CD Pipelines, API Integration, Vercel, Model Context Protocol (MCP), Snowflake, JavaScript, Amazon S3 (AWS S3), Docker, Django, Next.js, Data Science, ETL, Deep Neural Networks (DNNs), RAG Architecture, NLU, Model Deployment, Machine Learning Operations (MLOps), Data Pipelines, Kubernetes, Amazon EC2, Amazon EKS, Amazon Textract, Optical Character Recognition (OCR), Pgvector, Supabase, UiPath, Light LLMs, Data Analysis, Data Engineering, MySQL, Data Cleaning, SQL, Data Modeling, ETL Pipelines

Experience

Regime-aware News Sentiment Engine for Algorithmic Trading

I designed and implemented a regime-aware news sentiment system that combines state-of-the-art NLP with probabilistic time-series modeling to condition trading signals on market context dynamically. The system improves signal robustness by adapting sentiment interpretation across different macro and volatility regimes.

My technical approach included fine-tuning a DeBERTa-based news sentiment model on financial news and market-relevant text to generate high-fidelity sentiment scores. I implemented a Hierarchical Hidden Markov Model (HHMM) to infer latent market regimes at multiple time scales (short-term sentiment shocks versus long-term macro states). I conditioned trading signals on the joint state of news sentiment embeddings, inferred market regime probabilities, and macro and volatility features. I also integrated regime-aware sentiment features directly into downstream alpha models and execution logic.
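
The idea of conditioning a sentiment score on inferred regime probabilities can be illustrated with a small sketch. It is a simplification and not the system described above: it uses a flat two-state HMM with Gaussian return emissions instead of a hierarchical model, and every parameter value and function name (`filter_regimes`, `conditioned_signal`, `SENTIMENT_BETAS`) is a hypothetical chosen for illustration.

```python
import math

# Hypothetical two-regime parameters (illustrative values, not from the project).
TRANSITIONS = [[0.95, 0.05],   # calm -> calm, calm -> turbulent
               [0.10, 0.90]]   # turbulent -> calm, turbulent -> turbulent
MEANS = [0.0005, -0.001]       # mean daily return in each regime
STDS = [0.005, 0.02]           # return volatility in each regime
SENTIMENT_BETAS = [1.0, 0.3]   # assumed weight given to sentiment in each regime


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def filter_regimes(returns, prior=(0.5, 0.5)):
    """HMM forward filter: P(regime_t | returns observed up to t) for each t."""
    n_states = len(prior)
    belief = list(prior)
    history = []
    for r in returns:
        # Predict: push the previous belief through the transition matrix.
        predicted = [
            sum(belief[i] * TRANSITIONS[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        # Update: reweight by how likely the observed return is in each regime.
        weighted = [
            predicted[j] * gaussian_pdf(r, MEANS[j], STDS[j])
            for j in range(n_states)
        ]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief)
    return history


def conditioned_signal(sentiment, regime_probs):
    """Scale a sentiment score by the probability-weighted regime beta."""
    weight = sum(p * b for p, b in zip(regime_probs, SENTIMENT_BETAS))
    return sentiment * weight


# Two quiet days followed by two large moves shift the belief toward
# the turbulent regime, which shrinks the weight placed on sentiment.
beliefs = filter_regimes([0.001, -0.002, 0.03, -0.04])
print(beliefs[-1], conditioned_signal(0.8, beliefs[-1]))
```

In this sketch the same sentiment score produces a smaller signal once the filter believes the market is turbulent. A hierarchical model would add a slower macro-state layer on top of the same predict-and-update recursion.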

Player Prop Prediction and Simulation System for NFL and NHL

https://propsbot.ai/
I built a production-ready player prop prediction system for NFL and NHL games, combining machine learning, probabilistic modeling, and experimental evaluation to generate accurate, real-time player performance predictions. Precise prediction of player performance is challenging due to latent player states, seasonal dynamics, and non-stationary performance trends. Simple statistical models often fail to adapt to changing conditions or provide reliable confidence estimates.

My solution involved developing a hybrid modeling pipeline that captures both observable and latent patterns in player performance. This included XGBoost-based prediction models for individual player props, calibrated for online inference to provide real-time predictions. I also leveraged latent state modeling using Gaussian Mixture Models (GMMs) and Hidden Markov Models (HMMs) to capture player-level hidden traits and seasonal dynamics, improving predictive accuracy and robustness. Additionally, I designed and implemented A/B testing frameworks to evaluate challenger models in production, ensuring continual model improvement and reliable deployment.
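
One way to picture the champion-versus-challenger evaluation is a paired comparison of probability forecasts on the same set of props. The sketch below is an assumption about how such a check could look, not the framework used in production: it scores over/under probabilities with the Brier score and uses a paired bootstrap to put an interval on the improvement. The function names and the sample data are hypothetical.

```python
import random


def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)


def paired_bootstrap_ci(champion, challenger, outcomes,
                        n_resamples=2000, alpha=0.05, seed=0):
    """Percentile interval for (champion Brier - challenger Brier).

    Positive values mean the challenger has the lower (better) score.
    Resampling is paired: both models are always scored on the same props.
    """
    rng = random.Random(seed)
    diffs = [
        (pc - y) ** 2 - (pn - y) ** 2
        for pc, pn, y in zip(champion, challenger, outcomes)
    ]
    n = len(diffs)
    means = []
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical over/under outcomes (1 = over hit) and model probabilities.
outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
champion = [0.55, 0.50, 0.60, 0.52, 0.45, 0.58, 0.61, 0.40]
challenger = [0.70, 0.35, 0.72, 0.66, 0.30, 0.41, 0.75, 0.28]

print(brier_score(champion, outcomes), brier_score(challenger, outcomes))
print(paired_bootstrap_ci(champion, challenger, outcomes))
```

A simple promotion rule under these assumptions would be to deploy the challenger only when the lower bound of the interval is above zero. Real prop data would need far more than eight observations for the interval to be meaningful.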

AI-driven Quantitative Trading and Risk Management System

I developed an end-to-end AI-powered trading and risk management system that combines advanced machine learning, reinforcement learning, and probabilistic modeling to generate adaptive strategies and robust risk assessments for financial markets. This project involved building a multi-layered system integrating strategy generation, portfolio optimization, execution coordination, and risk management.

Synthetic indices were created using Random Forests, Gradient Boosting, and ensemble learning to generate robust signals. Bayesian Adaptive Factor Models were used for regime-aware stock weighting, dynamically adjusting allocations to changing market conditions. Reinforcement learning was used for portfolio and execution: Model-Based RL (MBRL) for dynamic portfolio rebalancing under risk and liquidity constraints, and Multi-Agent RL (MARL) for coordinated trade execution, reducing market impact and optimizing multi-asset strategies. Regarding advanced risk management, Causal Bayesian Networks were used for systemic risk inference, and Extreme Value Theory (EVT) was leveraged for tail-risk and stress-event modeling, protecting portfolios from rare but catastrophic events.
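
For the tail-risk component, a common EVT technique is peaks-over-threshold: fit a Generalized Pareto Distribution (GPD) to losses above a high threshold and read Value-at-Risk from the fitted tail. The sketch below is illustrative only and does not claim to reproduce the system described above. It assumes a method-of-moments fit (valid only when the shape parameter is below 0.5) where a real system would more likely use maximum likelihood, and the function names and sample losses are hypothetical.

```python
import math
import statistics


def fit_gpd_moments(excesses):
    """Method-of-moments GPD fit to threshold excesses.

    Uses mean = sigma / (1 - xi) and
    variance = sigma^2 / ((1 - xi)^2 * (1 - 2 * xi)).
    Returns (xi, sigma): shape and scale.
    """
    m = statistics.mean(excesses)
    v = statistics.variance(excesses)
    ratio = m * m / v
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * m * (1.0 + ratio)
    return xi, sigma


def pot_var(threshold, xi, sigma, n_total, n_exceed, q):
    """Peaks-over-threshold VaR at confidence level q.

    Only meaningful for quantiles beyond the threshold,
    that is when (1 - q) < n_exceed / n_total.
    """
    tail = (n_total / n_exceed) * (1.0 - q)
    if abs(xi) < 1e-12:
        # Exponential-tail limit of the GPD as xi approaches 0.
        return threshold - sigma * math.log(tail)
    return threshold + (sigma / xi) * (tail ** (-xi) - 1.0)


# Hypothetical daily losses (positive numbers are losses) and threshold.
losses = [0.002, 0.011, 0.004, 0.031, 0.007, 0.052, 0.001, 0.024,
          0.003, 0.120, 0.009, 0.041, 0.006, 0.015, 0.028, 0.005]
threshold = 0.02
excesses = [loss - threshold for loss in losses if loss > threshold]

xi, sigma = fit_gpd_moments(excesses)
print(xi, sigma, pot_var(threshold, xi, sigma, len(losses), len(excesses), 0.95))
```

The threshold choice drives the result: too low and the GPD approximation is poor, too high and too few excesses remain to fit.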

YourSearch.ai

https://yoursearch.ai/
YourSearch.ai is an advanced AI-powered search engine designed to offer personalized, unbiased, and summarized search results in real-time. The platform is integrated directly into messaging applications, allowing users to interact with search technology in a more seamless, natural way. The project focuses on delivering highly relevant search results tailored to the individual’s preferences and needs, enhancing the overall search experience.

Key features:
• Personalized search: Delivers customized search results based on individual preferences, ensuring a unique experience for each user.
• Unbiased results: Prioritizes accuracy and neutrality, providing users with unbiased and relevant information.
• Summarized content: Summarizes lengthy articles, research, and data, presenting only the most pertinent details for faster consumption.
• Real-time integration: Seamlessly integrates into messaging platforms to allow users to interact with the search engine without leaving their preferred communication tools.

Levity – AI-powered Email Automation for Logistics

https://levity.ai/
Levity is transforming logistics operations through AI-powered email automation. By streamlining processes like quoting, order entry, and tracking, Levity enhances efficiency and reduces manual workloads for companies like Shine Logistics Group and Ultraship. Its customizable workflows and seamless integrations with existing tools make it a powerful ally in freight management.

Key features:
• AI-powered workflows: Automates email classification and processing for quoting, order entry, and tracking.
• Data extraction: Captures critical information from emails and PDFs, including pickup dates, weights, and destinations.
• Seamless integration: Works with TMS platforms (e.g., Rose Rocket) and rate engines (e.g., DAT) for real-time pricing and load creation.
• Customizable responses: Generates replies within 30-60 seconds, with options for drafts or fully autonomous emails.
• Versatility: Handles various input formats but may face challenges with handwritten or cursive text.

Education

2018 - 2020

Bachelor's Degree in Mathematics

University of Colorado Boulder - Boulder, CO, USA

Skills

Libraries/APIs

PyTorch, XGBoost, TensorFlow, Scikit-learn, Pandas, LSTM, Node.js, Twilio API, vLLM, Keras, React

Tools

Azure OpenAI Service, Amazon SageMaker, AWS Glue, Tekton, Amazon EKS, Amazon Textract, Claude Code, Apache Airflow, Spark SQL, Vim Text Editor, Hidden Markov Model

Languages

Python, TypeScript, Snowflake, JavaScript, SQL

Frameworks

LightGBM, LangGraph, Django, Next.js, Spark

Paradigms

HIPAA Compliance, Model Context Protocol (MCP), ETL, Text Retrieval

Platforms

Databricks, Azure, Amazon Web Services (AWS), Vercel, Docker, Kubernetes, Amazon EC2, Harness, Linux

Storage

PostgreSQL, Amazon S3 (AWS S3), Data Pipelines, Elasticsearch, MySQL

Other

Torch, Transformers, Statistical Arbitrage, Bayesian Machine Learning, Sentiment Analysis, Reinforcement Learning, Variational Autoencoders (VAEs), Topological Data Analysis, Backtesting, Bayesian Statistics, Probabilistic Graphical Models, Simulations, Model Validation, Time Series, A/B Testing, Random Forests, Bayesian Networks, Meta-learning, Artificial Intelligence (AI), Large Language Models (LLMs), Architecture, Back-end, Machine Learning, AI Voice Agents, Conversational AI, Natural Language Processing (NLP), RESTFul APIs, Voice, VoiceBot, RAG Pipelines, Distributed Systems, Multimodal Models, Large-scale Distributed Systems, Retrieval-augmented Generation (RAG), AI Chatbots, Data Protection, Vector Databases, LangChain, OpenAI, Mistral AI, Generative Artificial Intelligence (GenAI), Full-stack, Multi-tenant SaaS, Agentic AI, Amazon Bedrock, Amazon Web Services Agent Core, Data Processing, Data Visualization, Financial Modeling, Pattern Recognition, Predictive Analytics, CI/CD Pipelines, API Integration, Server-side PDF Generation, Agentic RAG Systems, Data Science, Deep Neural Networks (DNNs), RAG Architecture, NLU, Model Deployment, Machine Learning Operations (MLOps), Amazon Kinesis, Optical Character Recognition (OCR), AI Agent Orchestration, Pgvector, Supabase, Light LLMs, Data Analysis, Data Engineering, Data Cleaning, Data Labeling, Data Modeling, ETL Pipelines, Mathematical Modeling, Mathematical Analysis, Diffusion Models, Graph Neural Networks (GNNs), Causal AI, FastAPI, Dagster, Vapi, UiPath, Fine-tuning, Mathematics, Reinforcement Learning from Human Feedback (RLHF)
