
Alexander Chistyakov
Verified Expert in Engineering
Software Engineer and Developer
Bozeman, MT, United States
Toptal member since December 30, 2025
Alexander is a senior full-stack ML engineer with 5+ years of experience. He has worked in quantitative finance, developing ML-based trading and arbitrage strategies, as well as in quantitative sports betting. He has designed, built, and deployed robust data pipelines for GBDTs, regression models, and transformers. With extensive experience in PyTorch, LightGBM, XGBoost, and numerous other frameworks, Alexander is skilled in deploying infrastructure and creating pipelines from scratch on bare Linux.
Portfolio
Experience
- Pandas - 6 years
- Torch - 5 years
- Reinforcement Learning - 5 years
- Backtesting - 5 years
- Probabilistic Graphical Models - 5 years
- XGBoost - 5 years
- Sentiment Analysis - 4 years
- LightGBM - 4 years
Preferred Environment
Vim Text Editor, Linux, LightGBM, PyTorch
The most amazing...
...project I've worked on involved developing ML-based trading bots based on people's descriptions of their trading strategies.
Work Experience
Data Scientist and Software Engineer
Xtrades
- Deployed and monitored Gradient Boosted Decision Tree (GBDT) models using LightGBM for predictive signal scoring, trade filtering, and model-driven decision support.
- Designed and implemented real-time financial data pipelines for market data aggregation, validation, and feature generation used across trading and analytics systems.
- Integrated news sentiment analysis pipelines to enrich trading signals, combining NLP-based sentiment features with price and volume data to improve signal robustness.
- Developed simulation and stress-testing frameworks using variational autoencoders to generate synthetic market scenarios, enabling strategy validation under rare or unseen market conditions.
- Applied statistical validation and monitoring techniques to ensure model stability, drift detection, and performance consistency in live trading environments.
Data Scientist
Propsbot.ai
- Built and maintained data ingestion pipelines using Dagster and Apache Airflow, integrating real-time sports data from Odds API and MySportsFeeds.
- Developed player prop prediction models for NFL and NHL using XGBoost, including model calibration for online inference.
- Implemented Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM) to capture player-level latent states and seasonal dynamics, improving predictive performance.
- Designed and executed A/B testing frameworks to evaluate and deploy challenger models in production.
Full-stack ML Engineer
Onebrain Technologies
- Built and deployed Python-based AI models for statistical arbitrage, integrated into live trading systems.
- Deployed production-grade ML pipelines using TensorFlow, Kubernetes, and Apache Airflow, achieving a 40% reduction in signal latency.
- Fine-tuned a news sentiment analysis model using DeBERTa, integrated with a Hierarchical Hidden Markov Model (HHMM) to detect market regimes and dynamically condition trading signals on macro and sentiment states.
- Developed advanced quantitative trading strategies using ensemble learning, Bayesian models, and reinforcement learning for adaptive indexing, regime-aware weighting, portfolio rebalancing, and execution optimization.
- Implemented advanced risk management using Causal Bayesian Networks, Topological Data Analysis, and Extreme Value Theory for systemic, regime, and tail-risk modeling.
- Built high-frequency execution systems with latency optimization, predictive order management, and real-time anomaly detection.
- Designed continuous adaptation frameworks, leveraging online learning, transfer learning, and evolving model ensembles to maintain performance across changing market conditions.
Experience
Regime-aware News Sentiment Engine for Algorithmic Trading
My technical approach included fine-tuning a DeBERTa-based news sentiment model on financial news and market-relevant text to generate high-fidelity sentiment scores. I implemented a Hierarchical Hidden Markov Model (HHMM) to infer latent market regimes at multiple time scales (short-term sentiment shocks versus long-term macro states). I conditioned trading signals on the joint state of news sentiment embeddings, inferred market regime probabilities, and macro and volatility features. I also integrated regime-aware sentiment features directly into downstream alpha models and execution logic.
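As a rough illustration of conditioning a signal on inferred regime probabilities, the sketch below runs a forward filter over a flat two-state HMM and scales a raw signal by a probability-weighted multiplier. It is a simplification, not the system described above: it uses a single-level HMM rather than an HHMM, discrete sentiment buckets rather than embeddings, and every name, matrix, and weight in it is a hypothetical placeholder.

```python
def filter_regimes(observations, transition, emission, prior):
    """Return filtered regime probabilities P(state_t | obs_1..t) for each t.

    `prior` is treated as the belief one step before the first observation,
    so the transition matrix is applied before every update.
    """
    n_states = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Predict: propagate the previous belief through the transition matrix.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        # Update: weight each state by the likelihood of the observed bucket.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n_states)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history


def condition_signal(raw_signal, regime_probs, regime_weights):
    """Scale a raw signal by the probability-weighted regime multiplier."""
    multiplier = sum(p * w for p, w in zip(regime_probs, regime_weights))
    return raw_signal * multiplier


# Hypothetical setup: state 0 = calm, state 1 = stressed;
# observation 0 = negative news bucket, 1 = positive news bucket.
TRANSITION = [[0.9, 0.1], [0.2, 0.8]]
EMISSION = [[0.2, 0.8], [0.7, 0.3]]
PRIOR = [0.5, 0.5]

beliefs = filter_regimes([0, 0, 0], TRANSITION, EMISSION, PRIOR)
# Assumed policy: full size in the calm regime, quarter size when stressed.
sized = condition_signal(1.0, beliefs[-1], regime_weights=[1.0, 0.25])
print(beliefs[-1], sized)
```

Weighting by the filtered probabilities, rather than switching on the single most likely regime, keeps the position size continuous as the regime estimate shifts.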
Player Prop Prediction and Simulation System for NFL and NHL
https://propsbot.ai/
My solution involved developing a hybrid modeling pipeline that captures both observable and latent patterns in player performance. This included XGBoost-based prediction models for individual player props, calibrated for online inference to provide real-time predictions. I also leveraged latent state modeling using Gaussian Mixture Models (GMMs) and Hidden Markov Models (HMMs) to capture player-level hidden traits and seasonal dynamics, improving predictive accuracy and robustness. Additionally, I designed and implemented A/B testing frameworks to evaluate challenger models in production, ensuring continual model improvement and reliable deployment.
AI-driven Quantitative Trading and Risk Management System
Synthetic indices were created using Random Forests, Gradient Boosting, and ensemble learning to generate robust signals. Bayesian Adaptive Factor Models were used for regime-aware stock weighting, dynamically adjusting allocations to changing market conditions. Reinforcement learning was used for portfolio and execution: Model-Based RL (MBRL) for dynamic portfolio rebalancing under risk and liquidity constraints, and Multi-Agent RL (MARL) for coordinated trade execution, reducing market impact and optimizing multi-asset strategies. Regarding advanced risk management, Causal Bayesian Networks were used for systemic risk inference, and Extreme Value Theory (EVT) was leveraged for tail-risk and stress-event modeling, protecting portfolios from rare but catastrophic events.
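To make the tail-risk idea concrete, here is a minimal sketch of one standard EVT tool: the Hill estimator of the tail index, followed by a Weissman-style extrapolation to an extreme loss quantile. The project description does not say which EVT estimators were used, so this is an assumed example; the function names, the choice of `k`, and the simulated Pareto losses are all hypothetical.

```python
import math
import random


def hill_tail_index(losses, k):
    """Hill estimate of the tail index alpha from the k largest losses."""
    if k < 1 or k >= len(losses):
        raise ValueError("k must satisfy 1 <= k < len(losses)")
    ordered = sorted(losses, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest loss is the tail threshold
    if threshold <= 0:
        raise ValueError("threshold must be positive; pass positive losses")
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess


def extreme_quantile(losses, k, exceed_prob):
    """Weissman-style estimate of the loss exceeded with probability exceed_prob."""
    alpha = hill_tail_index(losses, k)
    threshold = sorted(losses, reverse=True)[k]
    n = len(losses)
    # Extrapolate beyond the threshold using the fitted power-law tail.
    return threshold * (k / (n * exceed_prob)) ** (1.0 / alpha)


# Simulated heavy-tailed losses (Pareto with true tail index 3), for illustration only.
random.seed(0)
simulated_losses = [random.paretovariate(3.0) for _ in range(20000)]

alpha_hat = hill_tail_index(simulated_losses, k=1000)
loss_1_in_1000 = extreme_quantile(simulated_losses, k=1000, exceed_prob=0.001)
print(alpha_hat, loss_1_in_1000)
```

The estimate is sensitive to `k`: too few tail points gives a noisy estimate, too many pulls in observations that do not follow the power-law tail, so in practice `k` is usually chosen by inspecting how the estimate varies across a range of values.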
YourSearch.ai
https://yoursearch.ai/
Key features:
• Personalized search: Delivers customized search results based on individual preferences, ensuring a unique experience for each user.
• Unbiased results: Prioritizes accuracy and neutrality, providing users with unbiased and relevant information.
• Summarized content: Summarizes lengthy articles, research, and data, presenting only the most pertinent details for faster consumption.
• Real-time integration: Seamlessly integrates into messaging platforms to allow users to interact with the search engine without leaving their preferred communication tools.
Levity – AI-powered Email Automation for Logistics
https://levity.ai/
Key features:
• AI-powered workflows: Automates email classification and processing for quoting, order entry, and tracking.
• Data extraction: Captures critical information from emails and PDFs, including pickup dates, weights, and destinations.
• Seamless integration: Works with TMS platforms (e.g., Rose Rocket) and rate engines (e.g., DAT) for real-time pricing and load creation.
• Customizable responses: Generates replies within 30-60 seconds, with options for drafts or fully autonomous emails.
• Versatility: Handles various input formats but may face challenges with handwritten or cursive text.
Education
Bachelor's Degree in Mathematics
University of Colorado Boulder - Boulder, CO, USA
Skills
Libraries/APIs
PyTorch, XGBoost, TensorFlow, Scikit-learn, Pandas, LSTM, Node.js, Twilio API, vLLM, Keras, React
Tools
Azure OpenAI Service, Amazon SageMaker, AWS Glue, Tekton, Amazon EKS, Amazon Textract, Claude Code, Apache Airflow, Spark SQL, Vim Text Editor, Hidden Markov Model
Languages
Python, TypeScript, Snowflake, JavaScript, SQL
Frameworks
LightGBM, LangGraph, Django, Next.js, Spark
Paradigms
HIPAA Compliance, Model Context Protocol (MCP), ETL, Text Retrieval
Platforms
Databricks, Azure, Amazon Web Services (AWS), Vercel, Docker, Kubernetes, Amazon EC2, Harness, Linux
Storage
PostgreSQL, Amazon S3 (AWS S3), Data Pipelines, Elasticsearch, MySQL
Other
Torch, Transformers, Statistical Arbitrage, Bayesian Machine Learning, Sentiment Analysis, Reinforcement Learning, Variational Autoencoders (VAEs), Topological Data Analysis, Backtesting, Bayesian Statistics, Probabilistic Graphical Models, Simulations, Model Validation, Time Series, A/B Testing, Random Forests, Bayesian Networks, Meta-learning, Artificial Intelligence (AI), Large Language Models (LLMs), Architecture, Back-end, Machine Learning, AI Voice Agents, Conversational AI, Natural Language Processing (NLP), RESTful APIs, Voice, VoiceBot, RAG Pipelines, Distributed Systems, Multimodal Models, Large-scale Distributed Systems, Retrieval-augmented Generation (RAG), AI Chatbots, Data Protection, Vector Databases, LangChain, OpenAI, Mistral AI, Generative Artificial Intelligence (GenAI), Full-stack, Multi-tenant SaaS, Agentic AI, Amazon Bedrock, Amazon Web Services Agent Core, Data Processing, Data Visualization, Financial Modeling, Pattern Recognition, Predictive Analytics, CI/CD Pipelines, API Integration, Server-side PDF Generation, Agentic RAG Systems, Data Science, Deep Neural Networks (DNNs), RAG Architecture, NLU, Model Deployment, Machine Learning Operations (MLOps), Amazon Kinesis, Optical Character Recognition (OCR), AI Agent Orchestration, Pgvector, Supabase, Light LLMs, Data Analysis, Data Engineering, Data Cleaning, Data Labeling, Data Modeling, ETL Pipelines, Mathematical Modeling, Mathematical Analysis, Diffusion Models, Graph Neural Networks (GNNs), Causal AI, FastAPI, Dagster, Vapi, UiPath, Fine-tuning, Mathematics, Reinforcement Learning from Human Feedback (RLHF)