Alex Burlacu
Verified Expert in Engineering
Machine Learning Developer
Chișinău, Moldova
Toptal member since May 24, 2022
Over the years, as an experienced machine learning engineer, Alex has dealt with diverse problems, ranging from computer vision to natural language processing and time series forecasting. He has several times worked as the sole engineer on a project and, despite scarce data and limited computational resources, succeeded where others had failed. For the past few years, he has acted as a machine learning team lead. In his spare time, Alex enjoys independent lecturing and ML research.
Experience
- Python 3 - 8 years
- Machine Learning - 7 years
- Scikit-learn - 7 years
- Docker - 6 years
- Deep Learning - 6 years
- PyTorch - 5 years
- Team Mentoring - 5 years
- Machine Learning Operations (MLOps) - 5 years
Preferred Environment
Ubuntu, Python 3, Visual Studio Code (VS Code), Git, Docker, PyTorch, Neural Networks
The most amazing...
...thing I've made is a multilingual BERT model, trained with active learning, used in document tagging to identify tender attributes and speed up document processing.
Work Experience
ML/MLOps Consultant
Self-employed
- Consulted for a news startup, helping them establish an MLOps culture and processes to retrain their NLP models reliably. Enabled a team of data scientists to use LLMs for news summarization, news fact-checking, and synthetic data generation.
- Advised a team of senior data scientists on optimizing their cloud costs through a combination of software optimizations, right-sizing, and a move to a simpler setup spanning AWS, Azure, and on-premises infrastructure (for both batch inference and training).
- Implemented a multimodal (vision and text) photo editing application using GPT-4 and DALL-E 3, deployed on Google Cloud Run using Docker (see the sketch after this list).
- Helped a startup right-size its GCP VMs for CNN fine-tuning. Used Terraform and Ansible to provision VMs reliably. Defined the architecture to support multi-model placement on A100 GPUs for training and the use of Spot VMs to improve cost efficiency.
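As a rough illustration of the photo editing flow mentioned above, here is a minimal sketch assuming the official openai Python client: GPT-4 rewrites the user's instruction into an image prompt and DALL-E 3 renders the result. The edit_photo helper, model names, and prompts are illustrative assumptions, not the production code, and the vision step that reads the original photo is omitted.

```python
# Hypothetical sketch: GPT-4 turns an editing instruction into an image prompt,
# then DALL-E 3 renders the edited scene. Not the production application.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def edit_photo(instruction: str, photo_description: str) -> str:
    # Ask GPT-4 to merge the original scene and the requested edit
    # into a single, self-contained image-generation prompt.
    chat = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Rewrite photo-editing requests as one DALL-E prompt."},
            {"role": "user", "content": f"Photo: {photo_description}\nEdit: {instruction}"},
        ],
    )
    prompt = chat.choices[0].message.content

    # Render the edited scene with DALL-E 3 and return the image URL.
    image = client.images.generate(model="dall-e-3", prompt=prompt, size="1024x1024", n=1)
    return image.data[0].url


if __name__ == "__main__":
    print(edit_photo("make the sky look like a sunset", "a beach with two palm trees"))
```

A stateless handler like this is straightforward to containerize and deploy on Cloud Run, as described above.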
NLP Expert
Undetectable AI
- Developed a custom self-hosted LLM solution for the project, deployed it using Terraform on AWS ECS, and optimized its serving on EC2 G5 instances using vLLM (see the sketch after this list). Also developed load tests to assess, profile, and optimize the inference stack.
- Ran multiple rounds of exploratory text analysis, covering POS tags, semantic coherence, n-grams, and text readability statistics, to identify relevant structural and semantic patterns that we later used to adjust the core product's performance.
- Researched non-LLM-based solutions and automatic mining of textual patterns from documents for text enhancement, in an effort to create a new iteration of the product.
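A minimal sketch of the vLLM serving core referenced above, assuming offline batched generation; the model name, prompts, and sampling settings are placeholders, and the real deployment added an HTTP layer, ECS orchestration, and load tests.

```python
# Hypothetical sketch of the vLLM serving core: load a model once on the GPU(s)
# of a G5 instance, then batch-generate completions. Parameters are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

prompts = [
    "Rewrite this sentence in a more natural tone: the report was written by the team.",
    "Summarize the following paragraph: machine learning systems require monitoring.",
]
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```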
Software Lead
ClearML
- Led the implementation of the ClearGPT project, a set of no-code tools to train and deploy self-hosted LLMs for enterprises. Actively shaped the project's design and roadmap at different stages, specifically MVP, demo, and customer PoC.
- Tuned and deployed multiple LLaMA-based and FLAN-T5 models on AWS G5 instances for an optimal price-to-performance ratio. Also worked with multi-GPU and multi-node training using Hugging Face Accelerate.
- Built tools to generate Q&A datasets from documentation pages, a RAG-aware dataset generation pipeline, and a custom trainer that oversamples the worst-performing examples so the model focuses on improving hard cases (see the sketch after this list).
- Led the ClearML SDK team that developed features and ensured the timely release of both open-source and enterprise versions of packages. Actively involved in prioritizing and planning features for future releases.
- Contributed to community and enterprise support activities. Handled technical onboarding and actively advised clients on how best to leverage ClearML for their MLOps needs and ClearGPT for their enterprise GenAI setting.
- Debugged multiple issues related to dataset metadata management, pipelines API, distributed LLM training, and environment tracking for reproducibility.
- Acted as the go-to person for Google Cloud-related issues. Helped with the creation of custom machine images and with Spot instance support. Provided solution architecture support for enterprise customers running ClearML on GCP.
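The hard-example oversampling idea behind the custom trainer can be sketched with a plain PyTorch WeightedRandomSampler. This is a simplified stand-in, not the ClearGPT trainer itself; the dataset and per-example loss bookkeeping are assumed placeholders.

```python
# Hypothetical sketch: after an evaluation pass produces per-example losses,
# resample the training set so the worst-performing examples are drawn more often.
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler


def make_oversampled_loader(dataset, per_example_loss, batch_size=16):
    # Higher loss -> higher sampling weight; normalization keeps weights well scaled.
    weights = torch.as_tensor(per_example_loss, dtype=torch.float)
    weights = weights / weights.sum()
    sampler = WeightedRandomSampler(weights, num_samples=len(dataset), replacement=True)
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)


# Usage (placeholders): losses come from a full pass over the training set.
# loader = make_oversampled_loader(train_dataset, losses_from_last_epoch)
# for batch in loader: ...  # hard examples now appear more frequently
```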
Machine Learning Team Lead
DevelopmentAid
- Used machine learning (ML) and deep learning for natural language processing (NLP) on documents to make data entry more efficient.
- Developed and put into production multiple ML microservices, including one that classifies and tags documents via named entity recognition using PyTorch and BERT, and another that handles imbalanced multi-output text classification using scikit-learn.
- Defined and wrote programs for fast data annotation and synthetic data enrichment for named entity recognition (NER). Increased the dataset size from a handful of well-annotated documents to more than a hundred.
- Guided the development of new ML models and implemented practices such as ML code review, cross-validation, and replicable experiments.
- Defined MLOps practices, mainly model serving with Ray Serve (see the sketch after this list) and experiment tracking with MLflow.
- Established an observability infrastructure that reduced the number of unreported errors and cut bug discovery time from a few days to about 10 minutes. Used Jaeger and the ELK stack and helped drive the adoption of Prometheus and Grafana.
- Defined and documented the deployment process, reducing the time to deploy trained models to less than 10 minutes. Managed a Jenkins instance and implemented the process as Jenkins pipelines.
- Established code reviews, periodic one-on-one meetings, explicit coding best practices, and agile processes like iteration planning, planning poker, and standup meetings, reducing feature cycle time by 5x and new bugs per iteration to 0.3.
- Led a team of three junior engineers since July 2020, developing an automated data entry solution, building and deploying new ML models, and maintaining our observability and CI infrastructure.
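A minimal Ray Serve sketch of how a document-tagging model might be exposed as a microservice, in the spirit of the serving setup above; the Tagger class and its placeholder predict logic are illustrative, not the DevelopmentAid code.

```python
# Hypothetical sketch of serving a text-tagging model with Ray Serve.
# The predict logic is a placeholder for the real PyTorch/BERT pipeline.
from ray import serve
from starlette.requests import Request


@serve.deployment(num_replicas=2)
class Tagger:
    def __init__(self):
        # In the real service, the fine-tuned BERT NER model would be loaded here.
        self.model = lambda text: {"entities": []}

    async def __call__(self, request: Request):
        payload = await request.json()
        return self.model(payload["text"])


app = Tagger.bind()
# serve.run(app)  # exposes the deployment over HTTP on a running Ray cluster
```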
Research Intern
Universite Sorbonne Paris Nord
- Increased the sample efficiency of deep learning algorithms by mixing techniques from self-supervised, semi-supervised, and few-shot learning, applicable to images and other data sources.
- Used Google Colab notebooks to run experiments, then switched to Google Cloud Platform. Provisioned the infrastructure with Terraform and Ansible, creating a graphics processing unit (GPU) worker and a tracking server with a single Bash command in one to two minutes.
- Used MLflow for experiment tracking and a combination of Papermill and Optuna for hyperparameter optimization.
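A minimal sketch of pairing Optuna search with MLflow tracking, as in the setup above; the objective, search space, and the train_and_evaluate helper are assumed placeholders (the real experiments were additionally parameterized through Papermill notebooks).

```python
# Hypothetical sketch: each Optuna trial is logged as a nested MLflow run.
import mlflow
import optuna


def train_and_evaluate(lr, batch_size):
    # Placeholder for the real training loop; returns a dummy validation score.
    return 1.0 - abs(lr - 1e-3) - batch_size * 1e-4


def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    batch_size = trial.suggest_categorical("batch_size", [32, 64, 128])
    with mlflow.start_run(nested=True):
        mlflow.log_params({"lr": lr, "batch_size": batch_size})
        accuracy = train_and_evaluate(lr, batch_size)
        mlflow.log_metric("val_accuracy", accuracy)
    return accuracy


with mlflow.start_run(run_name="optuna-search"):
    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=30)
```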
University Assistant
Technical University of Moldova
- Recreated and taught the network programming course and two lab projects focusing on concurrency primitives and networking protocols.
- Authored and delivered the real-time programming course and three lab projects covering message-based concurrency, including the actor model and CSP, as well as message-oriented integration patterns and protocols like MQTT and XMPP.
- Overhauled and led the distributed systems and network programming courses and labs. Updated the real-time programming course and taught it as well.
- Covered diverse topics in the distributed systems course, such as data processing systems, distributed databases, microservice design patterns, and core problems of distributed systems like consensus, time, and exactly-once delivery.
- Mentored five final-year students on their semester projects; two of them chose me as their bachelor's thesis supervisor. Led labs for over 40 students per semester.
Summer Intern
CERN
- Participated in the EP-SFT group as an associate partner, receiving a grant from the UK Science and Technology Facilities Council (STFC).
- Developed a project to benchmark the TMVA package against TensorFlow on event-by-event inference performance targeting multi-layered perceptrons for high-energy physics (HEP).
- Searched for the bottlenecks and future directions of optimization for the TMVA subpackage of the ROOT scientific package.
- Concluded that, for one-by-one and small-batch (<32) inference modes, TMVA is up to two orders of magnitude faster than TensorFlow 1.8 built from source with AVX-512 enabled and used through its C++ inference API.
- Presented a poster about this work at a session at the EEML 2019 Summer School in Bucharest.
Machine Learning Engineer
Redox Entertainment
- Researched and developed neural networks for medical image analysis of oocytes for IVF. Created over ten bespoke neural network architectures using techniques like pre-training with autoencoders and Siamese networks for self-supervised learning.
- Mentored and trained a PhD intern for three months, who then joined the team and also worked on deep learning-related projects.
- Developed a specialized architecture for a small-sized, low-variance dataset of medical images with a performance on par with Google's AutoML Vision.
- Debugged a data pre-processing issue that leaked the test set and produced misleadingly high accuracy during evaluation. Prevented the release of the broken model, protecting the company's reputation.
Co-founder and CTO
BookVoyager
- Developed a search and content-based recommendation system for fiction books that extracts features from raw text and provides recommendations based on those features.
- Implemented logging for faster troubleshooting and defined the architecture as a multiservice system.
- Built the feature extraction and recommendation sub-systems based on token-level and whole-text analysis with SpaCy.
- Participated in customer interviews, defined both business and development processes, and pitched the project at various venues.
- Used profiling to identify the bottleneck and sped up the computation of recommendation results 85x with a pre-allocated array.
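The pre-allocation trick behind that speedup can be sketched as follows: scores are written into a NumPy array allocated once instead of being accumulated in growing Python lists. The shapes and similarity measure are illustrative, not the original BookVoyager code.

```python
# Hypothetical sketch: write similarity scores into a pre-allocated array
# instead of appending to a list, which was the profiled bottleneck.
import numpy as np


def score_candidates(query_vec: np.ndarray, book_vecs: np.ndarray) -> np.ndarray:
    n = book_vecs.shape[0]
    scores = np.empty(n, dtype=np.float32)  # allocate once, reuse every call
    for i in range(n):                      # could also be fully vectorized
        scores[i] = float(query_vec @ book_vecs[i])
    return scores


# Top-10 recommendations for a query embedding (placeholders):
# best = np.argsort(score_candidates(query, catalog))[::-1][:10]
```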
Experience
Serverless Platform
To enrich the platform's functionality, I added a few other services, such as RabbitMQ, MinIO, PostgreSQL, MongoDB, and Apache Tika. To make it easier to use, I wrote an API gateway-like service: a TCP server that translates HTTP requests into messages and returns the responses to the caller as HTTP responses (a minimal sketch of this request/reply flow appears below).
The project later became the basis of an independently taught course on distributed systems design. It was a free course with 25 students enrolled, 11 of whom received certificates of completion.
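A minimal sketch of the gateway's request/reply core using pika over RabbitMQ; the queue names and payloads are placeholders, and the real service also handled the raw TCP/HTTP translation that is omitted here.

```python
# Hypothetical sketch of the gateway core: forward a request as a RabbitMQ
# message and wait for the worker's reply, which is then returned over HTTP.
import uuid
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
reply_queue = channel.queue_declare(queue="", exclusive=True).method.queue

responses = {}


def on_reply(ch, method, props, body):
    # Match replies to requests via the correlation id.
    responses[props.correlation_id] = body


channel.basic_consume(queue=reply_queue, on_message_callback=on_reply, auto_ack=True)


def call_service(routing_key: str, payload: bytes) -> bytes:
    corr_id = str(uuid.uuid4())
    channel.basic_publish(
        exchange="",
        routing_key=routing_key,  # e.g., the target function's work queue
        properties=pika.BasicProperties(reply_to=reply_queue, correlation_id=corr_id),
        body=payload,
    )
    while corr_id not in responses:           # block until the worker replies
        connection.process_data_events(time_limit=1)
    return responses.pop(corr_id)
```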
Alex's Occasional Blog Posts | Personal Blog
https://alexandruburlacu.github.io
I created it using Jekyll, customized some of the templates, and added Google Analytics and Google Tag Manager.
Lightweight MLOps Template for AI Research
Moldova's National Python and AI Curriculum
https://mecc.gov.md/sites/default/files/curriculum_ia_aprobat_cnc.pdf
Education
Master's Degree in Computer Science
Stefan cel Mare University - Suceava, Romania
Master's Degree in Computer Science
Technical University of Moldova - Chisinau, Moldova
Bachelor's Degree in Computer Science
Technical University of Moldova - Chisinau, Moldova
Certifications
Google Cloud Certified Professional Machine Learning Engineer
Google Cloud
Google Cloud Certified Professional Cloud Architect
Google Cloud
Certified Kubernetes Application Developer (CKAD)
The Cloud Native Computing Foundation (CNCF)
Deep Learning Engineer
Workera
Skills
Libraries/APIs
Scikit-learn, REST APIs, PyTorch, TensorFlow, Jenkins Pipeline, Pandas, Keras, Vue, OpenCV, SpaCy
Tools
Git, Docker Compose, RabbitMQ, Jekyll, Google Analytics, Jenkins, Grafana, Scikit-image, Terraform, Ansible, Bazel, Kustomize, Helm, BigQuery, AWS CLI, Amazon Elastic Container Service (ECS), Amazon SageMaker
Languages
Python 3, Python, Elixir, Bash, SQL, C++, C, Python 2, Lisp, HTML, CSS, Java 8, Erlang, Scala
Paradigms
REST, Functional Programming, DevOps, Unit Testing, Object-oriented Analysis & Design (OOAD), Object-oriented Programming (OOP), Agile Software Development, Serverless Architecture, Parallel Programming, Actor Model, Microservices, Design Patterns
Platforms
Docker, ClearML, Ubuntu, Amazon EC2, Amazon Web Services (AWS), Kubernetes, Jupyter Notebook, Google Cloud Platform (GCP), Visual Studio Code (VS Code), Azure, Cloud Run
Frameworks
Flask, Ray, Optuna
Storage
JSON, Google Cloud, MongoDB, XML-RPC, PostgreSQL, Amazon S3 (AWS S3)
Other
Deep Learning, Machine Learning, Machine Learning Operations (MLOps), Natural Language Processing (NLP), Data Science, Artificial Intelligence (AI), BERT, Neural Networks, Generative Pre-trained Transformers (GPT), Fine-tuning, Large Language Models (LLMs), Language Models, University Teaching, Team Mentoring, FastAPI, Self-supervised Learning, Computer Vision, Team Leadership, Hugging Face, Graphics Processing Unit (GPU), Generative Artificial Intelligence (GenAI), Prompt Engineering, Data Synthesis, OpenAI, Open-source LLMs, Cloud, Large Language Model Operations (LLMOps), Distributed Systems, Cloud Computing, MinIO, Serverless, Transmission Control Protocol (TCP), HTTP, Coding, HATEOAS, Jaeger, Prometheus, Transformers, MLflow, Medical Imaging, Few-shot Learning, Hyperparameter Optimization, ROOT, HTTP 2, Message Queues, Mentorship, Image Processing, Sentiment Analysis, Data Engineering, Multi-GPU Training, Llama, Flan-T5, Question Generation, Q&A Bots, Retrieval-augmented Generation (RAG), OpenAI GPT-3 API, Debugging, Research, Mistral AI, GCP VMs, Google Cloud AutoML, Llama 3, Convolutional Neural Networks (CNNs), LLM Serving, Profiling