Lukas Sirsinaitis
Verified Expert in Engineering
Artificial Intelligence Developer
Vilnius, Vilnius County, Lithuania
Toptal member since July 24, 2020
With an academic background in finance and healthcare, Lukas excels at solving business problems using machine learning. His most commonly used tools are Python, SQL, and Spark, and he has 5+ years of experience in NLP and recommender systems. He holds multiple certifications, including Google Data Engineer and Azure AI Engineer, and is experienced in building and running pipelines in the cloud. His previous experience includes working with IBM Global Business Services and IBM Research.
Preferred Environment
Python, MacOS, Anaconda, Jupyter Notebook, PyCharm
The most amazing...
...thing I've created is a neural network-based system that handles thousands of complex emails every month and significantly reduces the manual workload.
Work Experience
Advisor (via Toptal)
Benable
- Provided advisory services on recommender system planning based on business needs and helped with data preparation, feature engineering, and overall system design.
- Explored various recommender system approaches, including hybrid matrix factorization and closed-source solutions, and recommended Amazon Personalize, leading to improved user experience based on qualitative user feedback (sketch below).
- Helped the company substantially improve content personalization in just a month with minimal development resources.
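Once an Amazon Personalize campaign like the one recommended above is live, retrieving personalized items is a single runtime call. The sketch below is a minimal illustration; the campaign ARN, region, and user ID are hypothetical placeholders, not the client's actual resources.

```python
import boto3

# Hypothetical campaign ARN; the real Personalize resources are not part of this profile.
CAMPAIGN_ARN = "arn:aws:personalize:us-east-1:123456789012:campaign/example-campaign"

personalize_runtime = boto3.client("personalize-runtime")

def recommend_items(user_id: str, num_results: int = 10) -> list[str]:
    """Return item IDs ranked for the given user by an Amazon Personalize campaign."""
    response = personalize_runtime.get_recommendations(
        campaignArn=CAMPAIGN_ARN,
        userId=user_id,
        numResults=num_results,
    )
    return [item["itemId"] for item in response["itemList"]]

print(recommend_items("user-42"))  # e.g., ['item-17', 'item-03', ...]
```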
Senior AI Engineer
University of North Carolina at Chapel Hill
- Led, planned, and implemented a generative AI pilot project under tight deadlines, taking it from initial concept to a rigorously tested solution ready for public testing.
- Utilized generative AI (GPT-4 Turbo and GPT-4o) to interact with users and enable automated decision-making using adapted medical documentation and advanced prompt engineering. Implemented numerous guardrails to ensure safety and reliability.
- Built a tool that mimicked consultations with a health practitioner, advising personalized health strategies based on medical literature (side effects, contraindications), user health history, preferred method of administration, and other factors.
- Produced a comprehensive PDF profile from every consultation for healthcare provider visits, enhancing the efficiency of medical appointments.
- Planned and implemented serverless infrastructure for conversation initiation and history storage using AWS Lambda, S3, API Gateway, and EventBridge, resulting in low operating costs and seamless scalability during peak usage.
- Planned and implemented a chat history storage and retrieval system, enabling subject-matter experts to efficiently review interactions and prepare high-quality training data.
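A minimal sketch of the conversation-history storage described above, assuming an AWS Lambda function behind API Gateway that writes each turn to S3; the bucket name, key layout, and payload shape are illustrative assumptions rather than the project's actual design.

```python
import json
import os
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")
BUCKET = os.environ.get("CHAT_HISTORY_BUCKET", "example-chat-history")  # placeholder name

def lambda_handler(event, context):
    """Store one conversation turn posted via API Gateway as a JSON object in S3."""
    body = json.loads(event.get("body") or "{}")
    conversation_id = body.get("conversation_id", "unknown")
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%f")
    key = f"conversations/{conversation_id}/{timestamp}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(body).encode("utf-8"))
    return {"statusCode": 200, "body": json.dumps({"stored": key})}
```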
Senior Machine Learning Engineer (via Toptal)
CultureX Inc.
- Developed an end-to-end MLOps pipeline that included a fine-tuned LLM (780M and 3B Flan-T5 model options). The parallel pipeline ran inference on millions of data points using multiple GPUs, AWS Step Functions, a SageMaker training job, and AWS Lambda.
- Refactored an XGBoost and SHAP values algorithm from a GPU-based to an efficient CPU- and EFS-based solution with massively parallel AWS Lambda invocations, delivering a 20x speedup and reducing the pipeline's average runtime from 10 minutes to 30 seconds.
- Developed an LLM-based classifier as a copilot to the internal human evaluation of models.
- Utilized Hugging Face's Optimum library and ONNX Runtime to prepare a quantized open-source large language model (Flan-T5) for deployment to AWS Lambda, enabling massively scalable inference requests (sketch below).
- Fine-tuned OpenAI's GPT models with custom datasets and incorporated models into the main application using OpenAI API, AWS Step Functions, AWS Lambda, and the AWS Cloud Development Kit (TypeScript).
- Conducted numerous experiments in summarization and retrieval-augmented generation tasks. Utilized models at Amazon Bedrock and used a second-generation AWS Inferentia accelerator for experiments with the LLaMA-2 model.
- Developed a scalable information retrieval system for million-row datasets. It included an embarrassingly parallel pipeline with GPU-based embedding generation and upload to PostgreSQL DB using AWS Step Functions, SageMaker, and Amazon S3.
- Built the information retrieval system in IaC format (TypeScript and CDK), enabling rapid deployment in minutes.
- Developed a hybrid, low-latency system designed for querying large datasets. The solution efficiently caches results by leveraging a combination of Amazon DynamoDB, DuckDB, Amazon Elastic File System (EFS), and Amazon Athena.
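As a rough illustration of the Optimum and ONNX Runtime step mentioned above, the sketch below exports a Flan-T5 checkpoint to ONNX and applies dynamic quantization for CPU inference. The model size, quantization settings, and paths are assumptions rather than the project's actual configuration, and the exported file names can vary across Optimum versions.

```python
from optimum.onnxruntime import ORTModelForSeq2SeqLM, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig
from transformers import AutoTokenizer

model_id = "google/flan-t5-base"  # stand-in checkpoint; the role used 780M/3B variants
onnx_dir = "flan-t5-onnx"

# Export the seq2seq model to ONNX.
model = ORTModelForSeq2SeqLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model.save_pretrained(onnx_dir)
tokenizer.save_pretrained(onnx_dir)

# Dynamically quantize each exported ONNX graph for CPU inference (e.g., inside AWS Lambda).
# File names below follow one common Optimum export layout and may differ by version.
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
for file_name in ["encoder_model.onnx", "decoder_model.onnx", "decoder_with_past_model.onnx"]:
    quantizer = ORTQuantizer.from_pretrained(onnx_dir, file_name=file_name)
    quantizer.quantize(save_dir=onnx_dir, quantization_config=qconfig)
```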
Machine Learning Engineer
A Leading Publisher of English Language Reference Material
- Spearheaded a project as the primary machine learning engineer, working alongside an intern, and successfully implemented two innovative language models that generated novel dictionary entries and ranked existing dictionary data.
- Used PyTorch, fastText, NLTK, spaCy, and other Python libraries to develop generative and ranking algorithms that employed large language models, word vectors, pre-trained models for toxicity filtering, spell-checking tools, and rule-based filtering.
- Increased the speed of the final algorithm with a Redis cache (sketch below); accessed terabytes of public and private data stored in MongoDB and Amazon S3 and preprocessed it on powerful AWS EC2 instances.
- Established a comprehensive MLOps pipeline hosted on an EC2 instance, which incorporated data retrieval from MongoDB, algorithmic data transformations using Python, and extensive data validation of the model output.
- Iteratively refined the algorithm through close collaboration with subject matter experts and metrics scored against a sample dataset. Led biweekly meetings with non-technical SMEs, presenting slides with diagrams and algorithm explanations.
- Successfully implemented the solutions despite tight deadlines and received excellent feedback after an extensive review by dictionary editors. The outcome is used by tens of millions of individuals worldwide.
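The Redis caching step mentioned above can be reduced to a small wrapper around the expensive scoring call, as sketched below; the connection settings, key scheme, TTL, and the stand-in scoring function are all illustrative assumptions.

```python
import hashlib

import redis

cache = redis.Redis(host="localhost", port=6379, db=0)  # placeholder connection settings

def expensive_model_score(entry_text: str) -> float:
    # Stand-in for the real ranking model; replace with the actual inference call.
    return len(entry_text) / 100.0

def cached_score(entry_text: str, ttl_seconds: int = 86400) -> float:
    """Return a score for a candidate dictionary entry, caching results in Redis for a day."""
    key = "entry-score:" + hashlib.sha256(entry_text.encode("utf-8")).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return float(hit)
    score = expensive_model_score(entry_text)
    cache.set(key, score, ex=ttl_seconds)
    return score
```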
Machine Learning Engineer
Visibly Works LLC, a subsidiary of Channel Bakers, Inc.
- Guided user-feedback- and data-driven iterative planning with the CEO of a large eCommerce analytics company based in California. The long-term goal was to optimize over $250 million of client spend using data science and machine learning.
- Researched terabytes of eCommerce data using Elasticsearch, MongoDB, and Amazon Athena, and prepared dashboards and charts for stakeholder decision-making using Google Data Studio, Tableau, Plotly, and Matplotlib.
- Unlocked better spending opportunities by building proprietary automated insights. Algorithms were developed in Python, while the data was preprocessed using Amazon Athena or Elasticsearch.
- Investigated an early version of Amazon Marketing Cloud containing 300+ features with interaction-level data on millions of users. Contributed to improving data infrastructure by identifying issues in data aggregation from high-traffic sources.
- Extracted insights from Amazon Marketing Cloud by developing complex SQL queries with multiple interrelated subquery components in the context of privacy restrictions and limited SQL functionality.
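Athena queries like those described above are commonly run from Python with the AWS SDK for pandas (awswrangler); the sketch below uses a hypothetical database, table, and query, since the actual Amazon Marketing Cloud SQL is proprietary and cannot be reproduced here.

```python
import awswrangler as wr

# Hypothetical database and table names for illustration only.
SQL = """
SELECT campaign_id,
       COUNT(DISTINCT user_id) AS reached_users,
       SUM(impressions)        AS impressions
FROM   example_ecommerce.campaign_events
GROUP  BY campaign_id
ORDER  BY impressions DESC
LIMIT  20
"""

# Runs the query via Amazon Athena and returns the result as a pandas DataFrame.
df = wr.athena.read_sql_query(SQL, database="example_ecommerce")
print(df.head())
```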
Machine Learning Engineer
Jumprope (acquired by LinkedIn)
- Tasked, as the sole machine learning engineer, with developing a video and image content recommendation engine for a social platform similar to Pinterest.
- Developed a recommendation engine consisting of a hybrid matrix factorization model, a custom algorithm based on user activity data distribution, and rule-based filters (sketch below).
- Built a custom UDF-based ETL pipeline in Redshift. The pipeline aggregated user behavior data (time spent, views, progress, likes, bookmarks, user polls, impressions) and data on user and item features.
- Employed online A/B testing, continuously training multiple ML models to gradually refine the production model toward the optimum. The platform eventually grew to 2 million monthly users and was later acquired by LinkedIn.
- Implemented a multi-armed bandit testing system that optimized push notification timing for every user.
- Developed a proof of concept for the summarization of textual data by using state-of-the-art transformer models.
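A hybrid matrix factorization model of the kind used in this recommendation engine is often built with a library such as LightFM; the sketch below trains one on synthetic implicit-feedback data with placeholder user and item features, and is not the production model itself.

```python
import numpy as np
from lightfm import LightFM
from scipy.sparse import coo_matrix, identity

# Synthetic interaction data: 100 users x 50 items with implicit positive feedback.
rng = np.random.default_rng(0)
rows = rng.integers(0, 100, size=1000)
cols = rng.integers(0, 50, size=1000)
interactions = coo_matrix((np.ones(1000), (rows, cols)), shape=(100, 50))

# Identity feature matrices stand in for real user/item feature columns.
user_features = identity(100, format="csr")
item_features = identity(50, format="csr")

# WARP loss is a common choice for implicit-feedback ranking.
model = LightFM(loss="warp", no_components=32, random_state=0)
model.fit(interactions, user_features=user_features, item_features=item_features, epochs=10)

# Score all items for one user and take the top 5.
scores = model.predict(3, np.arange(50), user_features=user_features, item_features=item_features)
top_items = np.argsort(-scores)[:5]
print(top_items)
```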
Data Scientist
IBM
- Used data science to solve various business problems, including human resource department transformation, M&A process transformation, fraud detection, and IT asset commercialization, all supporting revenue and profitability growth.
- Made significant contributions to various projects and was chosen as a member of IBM's highly selective special equity program designed to reward the company's highest contributors.
- Led workshops at IBM events with up to 350 participants. The workshops covered Watson Health, natural language processing, the latest cloud advancements for data scientists (AutoAI, petabyte-scale databases, etc.), and cloud certifications.
- Collaborated with remote global teams at IBM Global Business Services and IBM Research.
- Mentored five interns who then went on to be successful full-time employees at IBM.
Experience
Complex Email Answering System
My Contributions:
• Enabled the system to reach precision levels of over 90% on multiple topics.
• Worked closely with a team spanning multiple continents to achieve the final result.
Investigative Crime Analysis Tool
My Contributions:
• Implemented a custom machine learning model (NER, decision trees, and rules) to automate a data import process (file content recognition within XLSX, CSV, TXT) and mapping to a custom schema (sketch below).
• Used Kafka Event Streams and RabbitMQ for time-sensitive decoupled messaging and cloud-object storage for data retrieval.
• Packaged the application into a Docker container for deployment to Kubernetes.
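For the NER-driven file-content recognition mentioned above, one simple approach is to run spaCy over sample cell values and vote on a target schema field; the pipeline choice and label-to-field mapping below are simplified, hypothetical assumptions.

```python
import spacy

# Small English pipeline; the real system combined NER with decision trees and rules.
nlp = spacy.load("en_core_web_sm")

# Hypothetical mapping from spaCy entity labels to target schema fields.
LABEL_TO_FIELD = {"PERSON": "person_name", "GPE": "location", "DATE": "incident_date", "ORG": "organization"}

def guess_schema_field(sample_values: list[str]) -> str | None:
    """Guess a schema field for a column by majority vote over NER labels of sample values."""
    votes: dict[str, int] = {}
    for value in sample_values:
        for ent in nlp(value).ents:
            field = LABEL_TO_FIELD.get(ent.label_)
            if field:
                votes[field] = votes.get(field, 0) + 1
    return max(votes, key=votes.get) if votes else None

print(guess_schema_field(["John Smith", "Jane Doe", "Springfield"]))
```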
Custom Recommender System
My Contributions:
• Built an engine that consisted of an ML model (hybrid matrix factorization), a custom algorithm based on users' activity data distribution, and rule-based filters.
• Developed a custom UDF-based ETL pipeline in Redshift that ingested and preprocessed user behavior data (time spent, views, progress, likes, bookmarks, user polls, impressions) and data on user and item features.
• Gradually refined the hyperparameters of the production ML model toward the optimum using continuous online A/B testing.
Commercial Project Classification
• Our goal was to assist senior management with project investigation by estimating the probability that a project belongs to one of the following domains: technology and IT; central support and facilities management; customer interaction and sales; finance and risk; general management; human capital; marketing and experience management; supply; and make and delivery.
My Contributions:
• Took over the project midway through.
• Iterated through different machine learning algorithms, augmented and preprocessed the data, and implemented a Flask API.
• Used the XGBoost algorithm to classify commercial projects, improving prediction accuracy on the test set over the inherited model.
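A minimal sketch of a multi-class XGBoost setup over the domains listed above, using TF-IDF features on synthetic text; the real project's descriptions, preprocessing, and data are not reproduced here.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

DOMAINS = [
    "technology and IT", "central support and facilities management",
    "customer interaction and sales", "finance and risk", "general management",
    "human capital", "marketing and experience management", "supply", "make and delivery",
]

# Tiny synthetic corpus standing in for real project descriptions.
texts = [f"project description mentioning {d}" for d in DOMAINS for _ in range(20)]
labels = np.repeat(np.arange(len(DOMAINS)), 20)

X = TfidfVectorizer().fit_transform(texts)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=0, stratify=labels
)

# multi:softprob yields one probability per domain for each project.
clf = XGBClassifier(objective="multi:softprob", eval_metric="mlogloss")
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```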
Creating Customer Segments
• Applied unsupervised learning techniques on product spending data of customers of a wholesale distributor in Lisbon, Portugal, to identify customer segments hidden in the data.
• Explored correlations between product categories, applied PCA transformations, and implemented clustering algorithms to segment the transformed customer data (sketch below).
• Provided insights and ways this information could assist the wholesale distributor with future service changes.
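A compact sketch of the PCA-plus-clustering workflow described above, on synthetic spending data (the actual wholesale dataset is not bundled here); the scaling choices and cluster count are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for per-customer annual spend across six product categories.
rng = np.random.default_rng(42)
spend = rng.lognormal(mean=8, sigma=1, size=(440, 6))

# Log-transform and scale to tame the heavy right skew typical of spending data.
X = StandardScaler().fit_transform(np.log(spend))

# Reduce to two principal components, then cluster the transformed customers.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
segments = kmeans.fit_predict(X_2d)

print("explained variance ratio:", pca.explained_variance_ratio_)
print("customers per segment:", np.bincount(segments))
```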
Blended ChatGPT with Warren Buffett's Investment Wisdom
Image Caption Generation Model
Education
MSc Double Degree in Finance (Thesis in Machine Learning)
BI Norwegian Business School - Oslo, Norway
MSc Double Degree in Finance
ISM - Vilnius, Lithuania
MD Degree in Medical Science
Vilnius University - Vilnius, Lithuania
Certifications
Machine Learning Engineer
Google Cloud
Computer Vision Nanodegree
Udacity
Microsoft Certified: Azure AI Engineer Associate
Microsoft
Professional Data Engineer
Google Cloud
Building Resilient Streaming Systems on Google Cloud Platform
Coursera
Google Cloud Platform Big Data and Machine Learning Fundamentals
Coursera
Serverless Data Analysis with Google BigQuery and Cloud Dataflow
Coursera
Serverless Machine Learning with TensorFlow on the Google Cloud Platform
Coursera
Artificial Intelligence Nanodegree
Udacity
Natural Language Processing Nanodegree
Udacity
Machine Learning Engineer Nanodegree
Udacity
Big Data Applications: Machine Learning at Scale
Coursera
Big Data Essentials: HDFS, MapReduce and Spark RDD
Coursera
Data Scientist with Python Career Track
DataCamp
CFA Level 1
CFA Institute
Skills
Libraries/APIs
Scikit-learn, Keras, TensorFlow, Natural Language Toolkit (NLTK), SpaCy, Pandas, NumPy, PyTorch, LSTM, OpenCV, REST APIs, Matplotlib, XGBoost
Tools
ARIMA, SARIMA, Jupyter, ChatGPT, Gensim, BigQuery, PyCharm, Tableau, Microsoft Excel, You Only Look Once (YOLO), Hidden Markov Model, Microsoft Visual Studio, Plotly, Amazon Athena, Amazon SageMaker, Spark SQL, OpenAI Gym, AWS Cloud Development Kit (CDK), AWS Trainium, AWS Step Functions, Open Neural Network Exchange (ONNX), AWS Inferentia
Languages
Python, SQL, Python 3, R, TypeScript
Frameworks
Apache Spark, Spark, Flask, JSON Web Tokens (JWT)
Platforms
Jupyter Notebook, MacOS, Google Cloud Platform (GCP), Docker, Amazon Web Services (AWS), RStudio, Azure, Linux, Windows, Anaconda, Kubernetes, AWS Lambda
Storage
Databases, Data Pipelines, Elasticsearch, JSON, Redshift, NoSQL, Google Cloud, PostgreSQL, MongoDB, Amazon S3 (AWS S3), Amazon DynamoDB
Paradigms
MapReduce, ETL
Industry Expertise
Healthcare
Other
Machine Learning, Statistics, Natural Language Processing (NLP), Deep Neural Networks, Word2Vec, Artificial Intelligence (AI), Data Science, Data Analytics, Predictive Analytics, Analytics, Statistical Analysis, Predictive Modeling, Recommendation Systems, Neural Networks, Deep Learning, BERT, Data Queries, Time Series, CSV, fastText, Convolutional Neural Networks (CNN), General Medicine, OpenAI, Generative Pre-trained Transformers (GPT), LangChain, OpenAI GPT-3 API, Language Models, Recurrent Neural Networks (RNNs), Algorithms, Statistical Modeling, Data Modeling, Microsoft Azure, Medicine, IBM Cloud, Data Analysis, User-defined Functions (UDF), eCommerce Analysis, eCommerce, Google Data Studio, Computer Vision, Data Engineering, APIs, Machine Learning Operations (MLOps), Azure Data Factory, Cloud, Machine Vision, Object Recognition, CI/CD Pipelines, Pharmaceuticals, Distributed Systems, Generative Systems, Forecasting, Image to Text, Image Recognition, Marketing Mix Modeling, Customer Segmentation, Data-driven Marketing, Computer Vision Algorithms, Mathematics, Finance, Data Visualization, A/B Testing, Documentation, Product Analytics, Team Leadership, Kalman Filtering, Generative Artificial Intelligence (GenAI), Generative Adversarial Networks (GANs), Hugging Face, AWS Cloud Architecture, HPCC Systems, Lambda Functions, SHAP, Large Language Models (LLMs), Optimum, Flan-T5, Llama 2, DuckDB, Infrastructure as Code (IaC), Pgvector, Amazon RDS, Relational Database Services (RDS), AI Content Creation, AI Chatbots, OpenAI GPT-4 API, Amazon EventBridge, Amazon API Gateway, Search Engines, Amazon Personalize, Matrix Factorization, Data Preprocessing