
Anzor Gozalishvili
Verified Expert in Engineering
Machine Learning Developer
Tbilisi, Georgia
Toptal member since November 17, 2021
Anzor is a Senior ML/MLOps Engineer bridging deep research and industrial AI. He specializes in architecting enterprise platforms (Dagster, SageMaker, and W&B) to move R&D from notebooks to production. His expertise spans agentic LLM frameworks (RAG, MCP), drug discovery, and financial modeling. A published researcher in Nature and Springer, Anzor delivers scalable, high-impact solutions in bioinformatics, NLP, and multi-target regression models that drive global revenue.
Experience
- Docker - 4 years
- PyTorch - 3 years
- SpaCy - 3 years
- Amazon EC2 - 3 years
- Amazon S3 (AWS S3) - 3 years
- Machine Learning Operations (MLOps) - 2 years
- Amazon SageMaker - 2 years
- Amazon SageMaker Pipelines - 1 year
Preferred Environment
PyCharm, MacOS, Jupyter Notebook, Docker, PyTorch, Linux, Transformers, Amazon Web Services (AWS), Amazon SageMaker
The most amazing...
...thing I've developed was the core ML component of a shipping document analysis system for key information extraction at Holocene GmbH, a German startup.
Work Experience
Senior MLOps Engineer
Self-employed
- Architected an end-to-end platform using Dagster and W&B to manage hundreds of drug-discovery training pipelines. Enabled scalable model orchestration and lineage tracking via S3, streamlining complex R&D workflows.
- Built an LLM-powered agent with a chat interface, integrating vector search and custom MCP tools for automated workflows. Streamlined knowledge retrieval across internal systems via RAG and structured database access.
- Architected a SageMaker MLOps platform to migrate HPC workflows. Automated custom image builds and pipeline orchestration, transitioning from notebooks to production AWS pipelines to significantly accelerate development.
- Partnered with bioinformaticians to translate complex NGS and potency model requirements into technical specifications, aligning R&D drug-discovery objectives with engineering execution for specialized workflows.
Senior ML Engineer
KYROS Insights
- Built multi-target regression models predicting user spend and credit redemption. Identified high-intent segments to optimize loyalty credit allocation, directly driving incremental revenue and reducing marketing waste.
- Engineered scalable DL pipelines using Azure, Spark, and PyTorch Lightning for massive time-series data. Improved data throughput and training speed to enable frequent recalibration of risk models for enterprise clients.
- Managed model lifecycle via MLflow for experiment tracking and custom metrics. Developed specialized visualizations to translate complex ML outputs into stakeholder-ready insights, streamlining analysis of program ROI.
- Optimized multi-target regression by designing custom activation functions and modifying gradient calculations. This tailored approach improved convergence and precision across varied regression targets.
NLP Expert | Data Scientist
Pfizer - Manufacturing Operations Solutions
- Analyzed structures of SOPs from various sources and planned the data extraction strategies for paragraphs, tables, charts, formulas, etc.
- Developed a table extraction pipeline using several OCR services with custom logic on top of them.
- Conducted research on various layout analysis approaches.
NLP Data Scientist
Holocene GmbH
- Developed a document classifier based on each document's textual and visual features.
- Built several ML pipelines: an entity extractor using the LayoutLMv2 model, an entity extractor using Amazon Textract Forms with an entity type classifier on top of it, and a rule-based entity extractor.
- Researched barcode and signature detection and extraction.
Senior Machine Learning Researcher
MaxinAI
- Reproduced the previous state-of-the-art bitrate ladder prediction paper, available at https://arxiv.org/pdf/2102.04550.pdf.
- Generated new features that improved bitrate ladder prediction performance by 3% compared to the previous state-of-the-art.
- Ran experiments using deep learning models to extract these new features directly from videos and reduce inference time to meet industry requirements.
Data Scientist
Delivery Hero (Outstaffed from MaxinAI)
- Optimized queries and migrated them from AWS Redshift to Google BigQuery, improving query readability and run time.
- Incorporated new features into models, resulting in 2% performance improvements on average.
- Moved machine learning pipelines to Airflow DAGs, resulting in a more automated workflow with less human interaction.
- Ran small experiments on ML pipelines using Amazon SageMaker.
Lecturer | Teaching Assistant
MaxinAI
- Delivered lectures and workshops introducing machine learning.
- Taught natural language processing, including workshops.
- Prepared workshop materials for other lecture sessions.
Lead Machine Learning and NLP Engineer
MaxinAI
- Led a team of four machine learning engineers and successfully managed multiple projects for multiple clients.
- Developed the NLP component of a Swiss food regulatory startup's product that extracts the full nutritional information from food labels. It sped up employees’ work and reduced human error.
- Built an intelligent data crawling tool based on classical NLP algorithms that enabled the creation of a database of employees from US-based startups. It streamlined the workflow for job candidate recommendations and eliminated manual work.
- Created an innovative, self-improving NLP tool based on the latest DL models and an OCR that extracts essential data points from formal documents. It helped property managers to get more done in less time.
- Analyzed call center chat logs to evaluate operators' efficiency and look for potential issues. Applied named-entity recognition and keyword extraction techniques to analyze the problems related to products from the company's tech forum.
- Developed deep learning NLP models to detect merchant names from bank transaction records. Used LSTM-CRF and CNN models and achieved a 97% F1 score. The US-based startup used the best model to analyze large-scale bank transaction records.
- Experimented with trading bots for the crypto market using time series analysis and backtesting, generating positive daily income.
- Developed a recommender system for a startup selling second-hand clothes online. Personalized recommendations increased sales and customer satisfaction.
- Built a semantic search engine for retrieving relevant paragraphs from US law cases. It was a fully customizable search engine built on top of keyword extraction and fast vector search capabilities based on BERT embeddings.
Experience
ExtractHD Data Extraction Service
School of AI
https://github.com/MaxinAI/school-of-ai
We progressed from the basics of ML and essential maths to advanced techniques. The course consisted of lectures and workshops to help people apply their knowledge in practice while also learning best practices from industry practitioners.
Attendees successfully got hired by various tech companies in machine learning positions.
SGS Digicomply LabelWise | Food Label Data Extraction Service
https://www.digicomply.com/label-content-management
I managed the NLP models and also evaluated the entire system.
Eye Color Prediction
I optimized the eye color labeling process using clustering approaches to achieve higher classification accuracy. I also examined DNA methylation values at single-nucleotide polymorphisms (SNPs) to understand the gene expression mechanism.
Published a scientific paper on this work.
Georgian LLM Corpus (ACL Datasets)
https://github.com/AnzorGozalishvili/AnzorGozalishvili.github.io/blob/master/resources/creating_corpus_for_georgian_language_modeling_ACL_ARR_2024_Feb.pdf
Syntactic Annotation of Georgian in the UD Schemes (Springer Nature)
Comparative Analysis of Genetic Perturbation Models
Education
Master's Degree in Computer Science
Tbilisi State University - Tbilisi, Georgia
Erasmus Exchange and Scholarship in Computer Science
Universitat Politecnica de Valencia (UPV) - Valencia, Spain
Bachelor's Degree in Computer Science
Tbilisi State University - Tbilisi, Georgia
Certifications
Getting Started with AWS Generative AI for Developers
Coursera
Machine Learning Modeling Pipelines in Production
Coursera
Introduction to Machine Learning in Production
Coursera
Machine Learning Data Lifecycle in Production
Coursera
Deep Learning Specialization
Coursera
Sequence Models
Coursera
Convolutional Neural Networks
Coursera
Deep Learning for Business
Coursera
Structuring Machine Learning Projects
Coursera
Improving Deep Neural Networks: Hyperparameter Tuning, Regularization, and Optimization
Coursera
Neural Networks and Deep Learning
Coursera
Machine Learning
Coursera
Skills
Libraries/APIs
Pandas, Scikit-learn, NumPy, PyTorch, SpaCy, Matplotlib, Natural Language Toolkit (NLTK), TensorFlow, OpenCV, SciPy, FFmpeg, PIL, Keras, Flask-RESTful, Amazon Rekognition, PyMongo, LSTM, Google Vision API, XGBoost, REST APIs, Hugging Face Transformers, PyTorch Lightning, Spark ML, Stanford NLP
Tools
PyCharm, Docker Compose, Gensim, Amazon SageMaker, AutoML, RabbitMQ, BigQuery, Apache Airflow, Scikit-image, Seaborn, ABBYY, Amazon Textract, Pytest, GitHub, AWS IAM, Terraform, Elastic, ELK (Elastic Stack), Spark SQL, Claude Code, Claude, Amazon Elastic Container Registry (ECR), Amazon Elastic Container Service (ECS), Plotly
Languages
Python, SQL, Markdown
Platforms
Jupyter Notebook, Docker, Linux, Amazon EC2, Amazon Web Services (AWS), Databricks, Azure, AWS Lambda
Frameworks
Flask, LangGraph, Selenium, Scrapy
Paradigms
Agile, Testing, ETL, Model Context Protocol (MCP)
Storage
Amazon S3 (AWS S3), Datadog, Elasticsearch, MongoDB, Cloud Deployment, Data Pipelines, PostgreSQL, Neo4j, Graph Databases
Industry Expertise
Bioinformatics
Other
Natural Language Processing (NLP), Machine Learning, Word2Vec, Data Science, Generative Pre-trained Transformers (GPT), MLflow, Deep Learning, Feature Engineering, Machine Learning Operations (MLOps), Sentiment Analysis, Explainable Artificial Intelligence (XAI), Optical Character Recognition (OCR), Text Detection, Statistics, Artificial Intelligence (AI), Classification, Regression, Amazon SageMaker Pipelines, Generative Artificial Intelligence (GenAI), Amazon Redshift, Pipelines, fastText, Recommendation Systems, Information Retrieval, Active Learning, Variational Autoencoders (VAEs), Time Series Analysis, Principal Component Analysis (PCA), Lecturing, Workshop Facilitation, Computer Vision, Video Encoding, Research, BERT, Data Engineering, Bayesian Statistics, Evaluation, Text Classification, Entity Extraction, Data Analysis, Algorithms, Big Data, CI/CD Pipelines, Containerization, Tesseract, Statistical Data Analysis, Time Series, Neural Networks, Recurrent Neural Networks (RNNs), Data Analytics, Algorithmic Trading, Clustering, Experimental Design, Optimization, Genomics, Biology, Clustering Algorithms, Data Visualization, Hugging Face, Transformers, Text Mining, Authentication, APIs, LayoutLMv2, Parsers, Proof of Concept (POC), Profiling, Model Evaluation, Dagster, LangChain, AI Agents, DataTrove, Web Crawlers, Open-source LLMs, Tokenization, Computational Linguistics, Conference Speaking, Data Labeling, Scanpy, Prediction Markets, Predictive Modeling, AI Tools, Large Language Models (LLMs), Large Language Model Operations (LLMOps), Retrieval-augmented Generation (RAG), Vector Search, FastAPI, Amazon Bedrock, Agentic AI, Azure Databricks, RAG Pipelines, LLM Reasoning, Knowledge Graphs, Data Transformation, AWS Bedrock AgentCore