Arjaan Buijk, Developer in Plymouth, MI, United States

Arjaan Buijk

Machine Learning Developer

Plymouth, MI, United States

Toptal member since June 4, 2018

Bio

Arjaan is co-founder of onicai (onicai.com). He's an open-source developer and AI/LLM expert with deep experience in DevOps (AWS and Terraform), crypto (especially Internet Computer Protocol), Python web frameworks (Django and FastAPI), machine learning, and data science. He created icpp-pro, a C++ smart contract framework for the Internet Computer with over 74,000 downloads. A committed lifelong learner, Arjaan enjoys opportunities to collaborate on innovative and challenging projects.

Portfolio

onicai LLC
Agentic AI, Python, C++, Motoko, JavaScript, CI/CD Pipelines, GitHub Actions...
Rasa
Python, Rasa.ai, Zendesk, Google Cloud Platform (GCP)...
US-based SaaS Company
Django, Postman, Newman, Docker Compose, Python, APIs, CircleCI, Bitbucket...

Experience

  • Python - 10 years
  • C++ - 10 years
  • Artificial Intelligence (AI) - 6 years
  • Machine Learning - 4 years
  • Open-source LLMs - 3 years
  • llama.cpp - 3 years
  • Agentic AI - 2 years
  • Claude - 1 year

Preferred Environment

Python, C++20, llama.cpp, Django, Motoko, Terraform

The most amazing...

...AI agents I've built run llama.cpp on the Internet Computer Protocol.

Work Experience

Co-founder | CTO

2024 - PRESENT
onicai LLC
  • Built fully on-chain AI agents powered by the Internet Computer Protocol, integrated into multiple applications with a combined market capitalization of several hundred thousand dollars.
  • Designed and developed the open source Python and C++ tooling enabling open source LLMs to run on the Internet Computer Protocol using llama.cpp (65,000 downloads).
  • Guided the launch of funnai.onicai.com and wrote several Python utilities to monitor the status of AI agents and track application metrics.
Technologies: Agentic AI, Python, C++, Motoko, JavaScript, CI/CD Pipelines, GitHub Actions, Crypto, llama.cpp, Open-source LLMs, Data Engineering, ETL, Automation, Anthropic, Claude, ChatGPT, Gemini, Full-stack

Customer Success Engineering Manager/Director

2022 - 2024
Rasa
  • Led technical engagements with enterprise clients at Rasa, a leading conversational AI platform.
  • Acted as a trusted technical advisor, helping clients architect, build, and optimize AI-powered virtual assistants using Python and Rasa's NLP/ML framework.
  • Collaborated closely with product and engineering teams to align roadmap with client needs.
Technologies: Python, Rasa.ai, Zendesk, Google Cloud Platform (GCP), Amazon Web Services (AWS), Artificial Intelligence (AI), Large Language Models (LLMs), Retrieval-augmented Generation (RAG), Agentic AI, OpenAI, OpenAI GPT-4 API, Jupyter Notebook, Matplotlib, Google Cloud, CI/CD Pipelines, llama.cpp, Open-source LLMs, Data Engineering, ETL, Automation, ChatGPT, Full-stack

Senior Software Engineer

2021 - 2021
US-based SaaS Company
  • Improved developer productivity by creating a dockerized development environment for a SaaS application consisting of Django, Go, and React, which reduced local development setup from two days to one hour.
  • Created detailed wiki pages in Confluence with instructions for using the dockerized development environment and the API-testing framework. Worked with the development team to implement the new tools and improve their workflows.
  • Created an API-testing framework using Postman and Newman. REST APIs are served by Django, and GraphQL APIs are served by Go. The tests run automatically as part of CI/CD workflows with CircleCI and Bitbucket.
Technologies: Django, Postman, Newman, Docker Compose, Python, APIs, CircleCI, Bitbucket, Confluence, Jira, Agile, Jupyter Notebook, Data Engineering, Automation, Full-stack

Solutions Engineer (NLP)

2019 - 2021
Rasa
  • Supported large enterprise customers by implementing and deploying mission-critical chatbots built with Rasa. Deployments used Docker, Docker Compose, Kubernetes, and OpenShift. Infrastructure was a combination of on-prem, AWS, GCP, and Azure.
  • Designed and implemented NLU data, dialog stories, rules, forms, and custom actions (Python) for industry-relevant demonstrator chatbots.
  • Extended Rasa Open Source (Python), available at https://github.com/rasaHQ/rasa. This is an open-source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, and connectors to Slack, Facebook, and more.
  • Created and taught an online course on advanced deployment techniques with Kubernetes (https://www.udemy.com/course/rasa-advanced-deployment-workshop/).
  • Implemented Python Asyncio in back-end APIs resulting in dramatically improved throughput rates.
  • Created a CI/CD pipeline that trains a Rasa bot, builds a custom Docker image, stores the artifacts in AWS S3 and AWS ECR, automatically creates an AWS EKS cluster using the eksctl CLI, deploys Rasa with Helm, and runs smoke tests using Python.
Technologies: TensorFlow, Ubuntu, Windows, DevOps, Pandas, GitHub, NumPy, Chatbots, Google Cloud Platform (GCP), Helm, Kubernetes, Docker, Python, Machine Learning, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Amazon Web Services (AWS), CI/CD Pipelines, GitHub Actions, Rasa.ai, CircleCI, Python Asyncio, PostgreSQL, Webhooks, APIs, Agentic AI, Jupyter Notebook, Google Cloud, Data Engineering, ETL, Automation

Freelance Data Scientist

2018 - 2019
University of Colorado Boulder
  • Developed a sequence-based machine learning model in Python using TensorFlow to predict university student application probability based on millions of time-stamped engagements.
  • Developed clustering logic in Python using Scikit-Learn to group students by engagement behaviors.
  • Built a decision tree model in Python using XGBoost to predict the probability that admitted students enroll (yield).
Technologies: Ubuntu, Windows, Keras, GitHub, NumPy, Python, Slack, Zoom, Jira, Bitbucket, SQL, MongoDB, Amazon S3 (AWS S3), Jupyter, Scikit-learn, XGBoost, TensorFlow, Machine Learning, Amazon Web Services (AWS), APIs, Jupyter Notebook, Data Engineering, ETL, Data Science, Data Scraping, Automation

Software Engineer

1988 - 2019
MSC Software
  • Developed finite element and finite volume simulation software in Python, C++, and Fortran.
  • Designed and implemented a desktop application front-end with the Microsoft Foundation Class Library (MFC) and Qt.
  • Performed pre-sales demonstrations, customer training and support, sales, and business development.
  • Managed a team of solver developers, with responsibility for defining and executing projects, yearly employee reviews, and career planning for direct reports.
Technologies: Qt, PyQt, Ubuntu, Windows, Microsoft Foundation Class (MFC) Library, Graphical User Interface (GUI), Fortran, C++, Python

Founder | Owner

2008 - 2014
Simufact-Americas, LLC
  • Founded a company for the resale of manufacturing simulation software that I co-developed.
  • Achieved a 20-fold increase in revenue for the Americas region.
  • Used Python and web development to automate business processes.
  • Created pre-sales, sales, and post-sales onboarding processes.
Technologies: Qt, PyQt, Fortran, Windows, Python, Django

Experience

Platform Engineer | Healthcare Data Management

https://onoshealth.com/
I worked as a platform engineer building and maintaining AWS infrastructure for a healthcare technology platform. My responsibilities included designing Terraform modules for multi-environment deployments (development, staging, and production), developing Python Lambda functions for operational alerting, implementing secure secrets management with AWS Secrets Manager, and building CI/CD pipelines with GitHub Actions.

Key focus areas: Infrastructure-as-code, observability systems, and security-first cloud architecture.
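
As an illustration of the operational alerting described above, here is a minimal sketch of a Python Lambda handler. It assumes the alerts arrive as CloudWatch alarm notifications delivered through SNS; the function name `format_alarm` and the returned dictionary are hypothetical and are not taken from the actual platform.

```python
import json


def format_alarm(message: dict) -> str:
    """Turn a CloudWatch alarm payload into a one-line alert string."""
    name = message.get("AlarmName", "unknown-alarm")
    state = message.get("NewStateValue", "UNKNOWN")
    reason = message.get("NewStateReason", "")
    head = f"[{state}] {name}"
    return f"{head}: {reason}" if reason else head


def lambda_handler(event, context):
    """Entry point with the standard Lambda (event, context) signature."""
    alerts = []
    for record in event.get("Records", []):
        # SNS delivers the alarm as a JSON string in Sns.Message.
        payload = json.loads(record["Sns"]["Message"])
        # Assumption: only alarms entering the ALARM state are worth alerting on.
        if payload.get("NewStateValue") == "ALARM":
            alerts.append(format_alarm(payload))
    # A real function would forward `alerts` to a chat or paging service here.
    return {"alert_count": len(alerts), "alerts": alerts}
```

Keeping the formatting logic in a pure function makes it testable without any AWS credentials or network access.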

funnAI

funnAI is the next-generation AI ecosystem and first-ever implementation of the Proof-of-AI-Work (PoAIW) Protocol—a new consensus model where AI models compete, validate, and earn rewards in a fully decentralized environment.

icpp-pro: C++ Platform for the Internet Computer

icpp-pro is a comprehensive C++ development toolkit for the Internet Computer, designed to support the full on-chain execution of llama.cpp large language models as the core engine for AI agents.

It has 65,000 downloads from PyPI.

Llama.cpp for the Internet Computer

https://github.com/onicai/llama_cpp_canister
llama_cpp_canister is a fully on-chain large language model (LLM) smart contract for the Internet Computer. It compiles llama.cpp into WebAssembly, allowing quantized GGUF models (such as Qwen and DeepSeek) to run entirely on-chain without relying on off-chain compute. I developed the C++ core, built Python tools for model upload and caching, and implemented a complete CI/CD pipeline for WebAssembly builds, model compression, and automated testing.
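
To give a feel for what a model upload tool has to do, the sketch below splits a file into fixed-size chunks and computes a SHA-256 checksum of what was sent. It is not the project's actual upload code: it assumes the receiving endpoint accepts only bounded-size messages, and `send_chunk`, `fake_send`, and the tiny chunk size are hypothetical stand-ins.

```python
import hashlib


def iter_chunks(data: bytes, chunk_size: int):
    """Yield (offset, chunk) pairs that cover `data` in order."""
    for offset in range(0, len(data), chunk_size):
        yield offset, data[offset:offset + chunk_size]


def upload(data: bytes, send_chunk, chunk_size: int = 4) -> str:
    """Send data chunk by chunk and return the SHA-256 hex digest of it."""
    digest = hashlib.sha256()
    for offset, chunk in iter_chunks(data, chunk_size):
        send_chunk(offset, chunk)  # hypothetical transport callback
        digest.update(chunk)
    return digest.hexdigest()


# Stand-in receiver that appends chunks to an in-memory buffer.
received = bytearray()


def fake_send(offset: int, chunk: bytes) -> None:
    assert offset == len(received)  # chunks must arrive in order
    received.extend(chunk)


checksum = upload(b"toy-model-bytes", fake_send)
```

The returned checksum can be compared against a digest computed on the receiving side to confirm that the file arrived intact.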

Student Application Prediction

I developed a data pipeline and a machine learning model to predict university student application probability based on time-stamped engagements.

The data pipeline extracted millions of records from several SQL databases and generated features for the machine learning model. The end result of the data pipeline was a pandas DataFrame written to an S3 bucket. The machine learning pipeline loaded the pandas DataFrame and trained a custom TensorFlow model used by the university admissions department to identify the most promising prospective applicants.
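
The feature-generation step can be sketched as follows. This is a simplified illustration, not the original pipeline: it uses only the standard library instead of pandas, the event kinds and feature names are hypothetical, and it produces per-student aggregates rather than the sequences a sequence-based model would consume.

```python
from collections import defaultdict
from datetime import datetime


def build_features(events, as_of):
    """Aggregate (student_id, timestamp, kind) events into per-student features."""
    grouped = defaultdict(list)
    for student_id, timestamp, kind in events:
        grouped[student_id].append((datetime.fromisoformat(timestamp), kind))

    features = {}
    for student_id, rows in grouped.items():
        latest = max(ts for ts, _ in rows)
        features[student_id] = {
            "n_events": len(rows),                      # total engagements
            "n_kinds": len({kind for _, kind in rows}),  # distinct engagement types
            "days_since_last": (as_of - latest).days,    # recency in whole days
        }
    return features


# Toy data; the real pipeline read millions of records from SQL databases.
events = [
    ("s1", "2019-01-02T10:00:00", "email_open"),
    ("s1", "2019-01-05T09:30:00", "campus_visit"),
    ("s2", "2019-01-01T08:00:00", "email_open"),
]
features = build_features(events, as_of=datetime(2019, 1, 10))
```

Each entry in `features` corresponds to one row of the feature table that a downstream model would load.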

Chatbot for an Expert System

I developed a chatbot with Rasa and Elasticsearch and deployed it to a single-node Kubernetes cluster on AWS.

The chatbot provides an alternative interface to a web-based expert system. The data pipeline creates word and sentence embeddings from web-scraped data and injects them into an Elasticsearch database. The embeddings are created using pre-trained machine learning models from TensorFlow Hub. The chatbot listens to users' questions and finds the most similar results by querying Elasticsearch.
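
The "most similar results" lookup can be illustrated with a small ranking sketch. In the project the search was performed by Elasticsearch over embeddings from TensorFlow Hub models; the code below replaces both with toy three-dimensional vectors and assumes cosine similarity as the scoring function, which the description above does not specify.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query_vec, index, k=2):
    """Return the ids of the k documents most similar to the query vector."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:k]]


# Hypothetical document embeddings; real ones have hundreds of dimensions.
index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.7, 0.7, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
matches = top_matches([0.9, 0.1, 0.0], index)
```

A search engine performs the same ranking at scale, typically with approximate nearest-neighbor indexes rather than a full scan.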

Front-end Design and Implementation of a Windows Desktop Application

The application was an industrially proven software package for the computer simulation of industrial forging processes. It combined a familiar and intuitive Windows graphical user interface with a robust solution procedure to provide unprecedented accuracy and speed in forging simulations.

I was both a solver and a front-end developer.

Education

1982 - 1988

Master's Degree in Aerospace Engineering (CFD)

Delft University of Technology - Delft, the Netherlands

Certifications

AUGUST 2021 - AUGUST 2024

AWS Certified Cloud Practitioner

Amazon Web Services

MARCH 2021 - PRESENT

Nanodegree, Cloud DevOps Engineer

Udacity

JANUARY 2020 - PRESENT

Google Kubernetes Engine

Google via Coursera

JANUARY 2019 - PRESENT

Deep Learning

Deeplearning.ai via Coursera

SEPTEMBER 2018 - PRESENT

MongoDB for Developers

MongoDB University

MAY 2018 - PRESENT

Nanodegree, Full-stack Web Development

Udacity

DECEMBER 2017 - PRESENT

Nanodegree, Self-driving Car Engineer

Udacity

JANUARY 2016 - PRESENT

Retrieving, Processing, and Visualizing Data with Python

Coursera

Skills

Libraries/APIs

NumPy, Pandas, llama.cpp, Microsoft Foundation Class (MFC) Library, Scikit-learn, TensorFlow, Matplotlib, Keras, Python Asyncio, Newman, XGBoost, PyQt

Tools

Rasa.ai, GitHub, Docker Compose, Claude, ChatGPT, Terraform, Google Kubernetes Engine (GKE), Helm, GitLab, Amazon EKS, AWS CloudFormation, Amazon Virtual Private Cloud (VPC), Postman, Confluence, Jira, Jupyter, Bitbucket, Zoom, Slack, CircleCI, Ansible, PyPI, Pytest, Celery

Languages

C++, Python, SQL, Fortran, JavaScript, C++20, TypeScript

Frameworks

Django, Qt, Flask, AWS HA

Paradigms

Automation, DevOps, Agile, ETL, REST

Platforms

Docker, Amazon Web Services (AWS), Jupyter Notebook, Kubernetes, Ubuntu, Google Cloud Platform (GCP), Heroku, Windows, Linux, Amazon EC2, Zendesk, AWS Lambda

Storage

Elasticsearch, Amazon S3 (AWS S3), Google Cloud, Data Pipelines, MongoDB, PostgreSQL

Other

Chatbots, Natural Language Processing (NLP), Machine Learning, CI/CD Pipelines, Generative Pre-trained Transformers (GPT), Typer, Artificial Intelligence (AI), Open-source LLMs, Hugging Face, Large Language Models (LLMs), Crypto, Anthropic, APIs, Agentic AI, OpenAI, OpenAI GPT-4 API, Data Engineering, Full-stack, GitHub Actions, Graphical User Interface (GUI), FastAPI, Webhooks, WebAssembly (Wasm), Retrieval-augmented Generation (RAG), Motoko, Open Source, Data Science, Data Scraping, Gemini, caffeine.ai, Back-end, AWS ECS Fargate, Coderabbit, AWS Secrets Manager
