Anton Wolkov, Developer in San Gwann, Malta

Anton Wolkov

Verified Expert in Engineering

Bio

Anton specializes in big data infrastructure architecture and machine learning operations (MLOps). He has worked with many high-profile Fortune 500 companies and startups. He holds a bachelor's degree in computer science and has experience in big data infrastructure development and DevOps. He can get you started with data lakes, onboard data scientists, set up automation, and create production-grade self-service batch pipelines. Anton prefers using Airflow, Presto, TensorFlow, Kubernetes, and Grafana.

Portfolio

Toptal
Machine Learning Operations (MLOps), Python, Machine Learning, CI/CD Pipelines...
Neobrain
Azure Design, Python, OVH, Architecture, Artificial Intelligence...
Proofpoint
Machine Learning Operations (MLOps), DevOps, AWS, Apache Airflow, AWS Glue...

Availability

Full-time

Preferred Environment

Ubuntu, Amazon Web Services (AWS), Google Cloud Platform (GCP), macOS, Databricks, Jupyter Notebook, Zeppelin, IntelliJ IDEA, GitHub

The most amazing...

...thing I've created is a data pipeline infrastructure for six teams worldwide, generating a graph of all internet browsers and mobile phones.

Work Experience

MLOps Engineer

2024 - 2024
Toptal
  • Established a CI/CD deployment pipeline for new AI-enabled apps to a new set of Kubernetes clusters.
  • Integrated with existing networking, GitHub Actions, single sign-on (SSO), and monitoring tools and enabled rapid app development.
  • Backported Dify and n8n for UI-based AI app deployment for less technical users and POCs.
Technologies: Machine Learning Operations (MLOps), Python, Machine Learning, CI/CD Pipelines, Cloud Engineering, AWS, PyTorch, TensorFlow, CatBoost, Docker, Kubernetes, Apache Airflow, LLM, Grafana, Tableau Development, Azure Design, GitOps

DevOps Expert

2023 - 2023
Neobrain
  • Engaged as a DevOps expert for an AI SaaS in development.
  • Created Terraform and Helm infrastructures for a new production Kubernetes cluster in Azure.
  • Provisioned and configured TPU machines for a one-off training session.
  • Installed and configured a Prefect data pipeline with CI/CD in GitLab and monitoring in Prometheus and Grafana.
Technologies: Azure Design, Python, OVH, Architecture, Artificial Intelligence, Artificial Intelligence as a Service (AIaaS), Cost Reduction & Optimization (Cost-down), Prefect, TPU, GPU Computing, Cloud Engineering, Data Build Tool (dbt), Grafana, Prometheus, Kubernetes, Google Kubernetes Engine (GKE), Azure Kubernetes Service (AKS), Terraform, Helm, Azure Machine Learning, Machine Learning Operations (MLOps), DevOps, CI/CD Pipelines, Machine Learning, Software Architecture, GPT-4, LLM, GitOps

Lead MLOps | DevOps | Software Engineer

2020 - 2023
Proofpoint
  • Integrated multiple teams' data into a natural language processing (NLP) oriented batch data pipeline.
  • Used ETL for data exploration and integration tests on anonymized data.
  • Designed microservices architecture using Python, Docker, Helm, and AWS Service Operator.
  • Built a Jenkins-based CI/CD pipeline for Kubernetes deployment of the front and back ends.
  • Merged Prometheus and Grafana dashboards from multiple Amazon EKS clusters using Thanos.
  • Automated PagerDuty incident management with playbooks and CI/CD pipeline deployments.
Technologies: Machine Learning Operations (MLOps), DevOps, AWS, Apache Airflow, AWS Glue, Kubernetes, MLflow, Prometheus, TensorFlow, Jenkins, Elasticsearch, Cloud Infrastructure, Cloud Security, Auto-scaling Cloud Infrastructure, AWS Auto Scaling, Linux, Terraform, Back-end Developers, Ubuntu, System Architecture, Scalability, Data Science, Cloud Engineering, CircleCI, Jupyter Notebook, Zeppelin, Docker, Jenkins Pipeline, Python, Go, Spark, Grafana, MongoDB, Redis, Jira, Confluence, Kibana, Apache, System Security, Presto, Pandas, Apache Kafka, Data Lakes, Apache Superset, Database, Ansible, Amazon EKS, Amazon EC2, AWS RDS, Amazon OpenSearch, Amazon S3, PostgreSQL, Hadoop, IntelliJ IDEA, AWS Lambda, CI/CD Pipelines, Helm, Google Kubernetes Engine (GKE), Big Data Architecture, Continuous Integration (CI), NoSQL, Relational Databases, ELK Stack, Data Protection, Cost Reduction & Optimization (Cost-down), Artifactory, Bash, Microservices Development, Amazon Elastic Container Service (ECS), AWS DevOps, DNS, API Gateways, Docker Hub, AWS CLI, Amazon Virtual Private Cloud (VPC), Serverless, Identity & Access Management (IAM), Automated Testing, REST API, Snowflake, Data Engineering, Artificial Intelligence, Machine Learning, Shell Script, SQL, Architecture, Cloud Architecture, Google Cloud Development, Git, DevSecOps, Kubernetes Operations (kOps), JavaScript, Data Visualization, Data Modeling, Algorithms, Mathematics, Data, Mathematical Analysis, Data Analysis, API Integration, Postman, Data Integration, Performance Optimization, Cost Management, Personally Identifiable Information (PII), Infrastructure as Code (IaC), Site Reliability, Infrastructure, Amazon Aurora, Argo CD, Sentry, Sonarqube, APIs, Proxies, Developer Portals, Microservices Architecture, GPU Computing, NVIDIA TensorRT, Agile Development, Containers, Argo Workflow, GraphQL, React.js, TypeScript, IT Support, Software Architecture, GitOps

Principal Software Engineer

2018 - 2020
Oracle
  • Created a self-service process for data scientists to productize their data proof of concepts (POCs).
  • Integrated metrics collection and reporting into all parts of the pipeline and GitHub pull requests.
  • Migrated AWS EMR workloads using Spark and Kubernetes running on OCI infrastructure.
  • Developed lightweight microservices to handle real-time pixel requests with strict service level agreements (SLAs).
  • Extended the Python and Amazon S3 (AWS S3) library to support Oracle Cloud. Optimized for high-latency operations.
Technologies: Python, Scala, Go, Luigi, Presto, Apache Pig, Spark, Pandas, Jenkins, Apache Airflow, MLflow, Kibana, Amazon S3, Qubole, Kubernetes, Hadoop, Aerospike, Grafana, Prometheus, Apache Kafka, Machine Learning Operations (MLOps), Elasticsearch, AWS, Cloud Infrastructure, Cloud Security, Auto-scaling Cloud Infrastructure, AWS Auto Scaling, Linux, Terraform, Back-end Developers, Ubuntu, System Architecture, Oracle Development, Scalability, Data Science, Data Lakes, ETL Tools, ETL Testing, ETL, Tableau Development, Business Intelligence Development, Apache Superset, Zeppelin, IntelliJ IDEA, Content Delivery Networks (CDN), AWS Glue, AWS ELB, AWS IAM, Oracle Cloud Infrastructure (OCI), TensorFlow, Jira, Confluence, DevOps, Amazon EMR, Ansible, Amazon EKS, AWS Cloud, Amazon EC2, AWS RDS, Amazon OpenSearch, Jupyter Notebook, Docker, Database, Redis, Java, Cassandra, Apache, PostgreSQL, AWS Lambda, CI/CD Pipelines, Helm, Big Data Architecture, Continuous Integration (CI), Jenkins Pipeline, NoSQL, Relational Databases, ELK Stack, Data Protection, Cost Reduction & Optimization (Cost-down), Artifactory, Bash, Microservices Development, AWS DevOps, DNS, API Gateways, AWS CLI, Amazon Virtual Private Cloud (VPC), Serverless, Automated Testing, REST API, Prefect, Snowflake, Data Engineering, Artificial Intelligence, Machine Learning, Shell Script, MySQL, SQL, Architecture, Cloud Architecture, Migration Engineering, Cloud Migration, Git, DevSecOps, Kubernetes Operations (kOps), Data Visualization, Cloud Engineering, Data Modeling, Algorithms, Mathematics, Data, Mathematical Analysis, Data Analysis, API Integration, Postman, Data Integration, Ads, Performance Optimization, Cost Management, Personally Identifiable Information (PII), Infrastructure as Code (IaC), Site Reliability, Infrastructure, Redshift, Sonarqube, Gradle, Harbor, APIs, Proxies, Developer Portals, Microservices Architecture, Agile Development, Containers, PySpark, Dagster, Software Architecture, Data Migration, EMR, Bitbucket, C, FastAPI

Software Engineer II

2017 - 2018
Amazon.com
  • Onboarded a new real-time database to sync annotators' inputs. Used JavaScript, ETL, report generator, and data exploration tools for AI experiments and proof of concepts (POCs).
  • Repurposed an internal voice annotation platform to be used for computer vision.
  • Automated status reporting from an experiment management platform to Confluence.
  • Created a Jira ticket templating system to simplify operational process status tracking.
Technologies: Spark, Apache, PostgreSQL, Amazon S3, Jupyter Notebook, Zeppelin, RethinkDB, Java, Python, Amazon EMR, System Security, PyTorch, Machine Learning Operations (MLOps), Elasticsearch, AWS, Cloud Infrastructure, Cloud Security, Auto-scaling Cloud Infrastructure, AWS Auto Scaling, Linux, Back-end Developers, System Architecture, Scalability, Data Science, Data Lakes, ETL, ETL Tools, Database, Apache Superset, Business Intelligence Development, Apache Kafka, Computer Vision, Computer Vision Algorithms, Generative Pre-trained Transformers (GPT), NLP, Jenkins Pipeline, Tableau Development, Ubuntu, AWS IAM, AWS ELB, ETL Testing, Amazon EKS, AWS RDS, Amazon EC2, Amazon OpenSearch, Docker, Jenkins, Redis, Jira, Confluence, Kibana, Pandas, DevOps, IntelliJ IDEA, AWS Lambda, CI/CD Pipelines, Big Data Architecture, Continuous Integration (CI), NoSQL, Relational Databases, ELK Stack, Data Protection, Cost Reduction & Optimization (Cost-down), Bash, Microservices Development, AWS DevOps, AWS CLI, Serverless, REST API, Data Engineering, Artificial Intelligence, Machine Learning, Shell Script, SQL, Architecture, Cloud Architecture, Migration Engineering, Cloud Migration, Git, HTML, JavaScript, Data Visualization, R, Cloud Engineering, Data Modeling, Algorithms, Mathematics, Data, Mathematical Analysis, Data Analysis, Generative Adversarial Networks (GANs), Image Processing, Generative Design, 3D Modeling, API Integration, Postman, Data Integration, Performance Optimization, SageMaker, Personally Identifiable Information (PII), Site Reliability, Infrastructure, Redshift, Gradle, APIs, Proxies, Developer Portals, Microservices Architecture, Agile Development, Containers, PySpark, Software Architecture, Data Migration, Hadoop, EMR, JW Player

Software Engineer II

2014 - 2017
Microsoft
  • Integrated users and file APIs from Microsoft Office 365, Google, ServiceNow, Salesforce, and Okta. Used custom asynchronous distributed rate limiter logic.
  • Created a data playground and scale test with automated CI/CD pipelines for data science proof of concepts (POCs). Utilized a huge anonymized production data sample.
  • Integrated data pipelines to Splunk monitoring. Continued with later iterations of Apache Flink, which were integrated into Prometheus and Grafana.
  • Optimized MongoDB and Elasticsearch-based pipelines to scale for all of Microsoft's customers' data from Outlook and SharePoint.
Technologies: Java, Python, Apache, Splunk, Kibana, Jenkins, Azure Design, Elasticsearch, MongoDB, Cassandra, Amazon S3, Azure, AWS, Cloud Infrastructure, Cloud Security, Auto-scaling Cloud Infrastructure, AWS Auto Scaling, Linux, Terraform, Back-end Developers, Ubuntu, System Architecture, Scalability, Data Science, Scala, ETL, Data Lakes, Database, ETL Tools, ETL Testing, AWS ELB, Apache Kafka, Anomaly Detection, Jenkins Pipeline, Azure Blobs, IntelliJ IDEA, Chef, RabbitMQ, Amazon EC2, Amazon OpenSearch, Jupyter Notebook, Docker, Machine Learning Operations (MLOps), Redis, Jira, Confluence, DevOps, AWS Lambda, CI/CD Pipelines, Big Data Architecture, Continuous Integration (CI), NoSQL, ELK Stack, Data Protection, Cost Reduction & Optimization (Cost-down), Bash, Microservices Development, AWS DevOps, AWS CLI, Serverless, Identity & Access Management (IAM), Node.js, Automated Testing, REST API, Data Engineering, Artificial Intelligence, Machine Learning, Shell Script, Nagios, Haproxy, SQL, Architecture, Cloud Architecture, Migration Engineering, Cloud Migration, Git, DevSecOps, Data Visualization, Cloud Engineering, Data Modeling, Algorithms, Mathematics, Data, Mathematical Analysis, Data Analysis, API Integration, Postman, Data Integration, Swagger, Performance Optimization, Personally Identifiable Information (PII), Infrastructure as Code (IaC), Infrastructure, Amazon Aurora, Azure Kubernetes Service (AKS), Gradle, APIs, Proxies, Developer Portals, Microservices Architecture, Agile Development, Business Intelligence Development, Containers, Django, Azure Machine Learning, Software Architecture, Data Migration

Android Automation App

My university project was to create a robust, free, and intuitive automation app for Android similar to Tasker. I included natural language descriptions and sharing capabilities. The app was downloaded over 100,000 times and was translated into eight languages by volunteers.

Hackathon Project

This hackathon project scans your email (Gmail, Outlook) for receipts with links and downloads the files from those links. I set it up to upload the files and attach them to the original message. The project was written in Python and hosted on Google Cloud using Cloudflare and content delivery networks (CDN).
2009 - 2014

Bachelor's Degree in Computer Science

Technion – Israel Institute of Technology - Haifa, Israel

Libraries/APIs

Luigi, Apache, Pandas, TensorFlow, Jenkins Pipeline, REST API, React.js, PySpark, PyTorch, Node.js, CatBoost

Tools

Jenkins, Apache Airflow, Spark, Grafana, Amazon OpenSearch, Jira, Kibana, Amazon EMR, System Security, Qubole, AWS, Terraform, Tableau Development, Business Intelligence Development, IntelliJ IDEA, CircleCI, AWS ELB, AWS IAM, Chef, RabbitMQ, Ansible, Amazon EKS, Helm, ELK Stack, GitHub, Artifactory, Confluence, Amazon Elastic Container Service (ECS), Docker Hub, AWS CLI, Amazon Virtual Private Cloud (VPC), Nagios, Git, Postman, SageMaker, Azure Kubernetes Service (AKS), Sentry, Sonarqube, Gradle, Google Kubernetes Engine (GKE), Azure Machine Learning, Bitbucket, Splunk, Apache, AWS Glue, Prefect, BigQuery, JW Player

Languages

Python, Go, Java, Bash, Snowflake, SQL, HTML, JavaScript, GraphQL, TypeScript, C, Scala, R

Frameworks

Presto, Big Data Architecture, Swagger, Spark, Django, Hadoop

Paradigms

DevOps, ETL, Anomaly Detection, Continuous Integration (CI), Microservices Development, Automated Testing, DevSecOps, Microservices Architecture, Agile Development

Platforms

Cloud Engineering, Jupyter Notebook, Zeppelin, Docker, Kubernetes, Azure Design, Apache Kafka, AWS, Linux, Ubuntu, Azure, Oracle Cloud Infrastructure (OCI), Amazon EC2, AWS Lambda, Harbor, Apache, Android, Apache Pig

Storage

Elasticsearch, MongoDB, Redis, Amazon S3, PostgreSQL, RethinkDB, Hadoop, Auto-scaling Cloud Infrastructure, Oracle Development, Data Lakes, Database, AWS, Azure Blobs, NoSQL, Relational Databases, MySQL, Google Cloud Development, Data Integration, Redshift, Amazon Aurora, Aerospike, OVH

Other

Machine Learning Operations (MLOps), Prometheus, MLflow, Cloudflare, Cloud Infrastructure, Cloud Security, AWS Auto Scaling, Back-end Developers, System Architecture, Scalability, Data Science, ETL Tools, ETL Testing, Apache Superset, Content Delivery Networks (CDN), NLP, AWS RDS, AWS Cloud, CI/CD Pipelines, Big Data Architecture, Data Protection, Cost Reduction & Optimization (Cost-down), AWS DevOps, DNS, API Gateways, Serverless, Identity & Access Management (IAM), Data Engineering, Artificial Intelligence, Machine Learning, Shell Script, Haproxy, Architecture, Cloud Architecture, Migration Engineering, Cloud Migration, Kubernetes Operations (kOps), Data Visualization, Cloud Engineering, Data Modeling, Algorithms, Mathematics, Data, Mathematical Analysis, Data Analysis, Generative Adversarial Networks (GANs), Image Processing, Generative Design, 3D Modeling, API Integration, Ads, Performance Optimization, Cost Management, DevOps, Personally Identifiable Information (PII), Infrastructure as Code (IaC), Site Reliability, Infrastructure, Argo CD, APIs, Proxies, Developer Portals, GPU Computing, NVIDIA TensorRT, Containers, Argo Workflow, IT Support, Software Architecture, GPT-4, Data Migration, EMR, LLM, FastAPI, GitHub Actions, GitOps, Cassandra, Computer Vision, Computer Vision Algorithms, Generative Pre-trained Transformers (GPT), Dagster, Google BigQuery, Artificial Intelligence as a Service (AIaaS), TPU, Data Build Tool (dbt), Multimodal GenAI
